diff --git a/.github/README.md b/.github/README.md index 754e697..5c3cbba 100644 --- a/.github/README.md +++ b/.github/README.md @@ -13,7 +13,7 @@ I am happy to do so for you. ## Build Dependencies -Building requires 2 basic dependencies: +Building requires 2 widely available dependencies: - `discount` (markdown implementation) - `envsubst` @@ -28,8 +28,6 @@ Building requires 2 basic dependencies: ## Build script -You can then run. - ./build.sh ## Build with Docker @@ -37,7 +35,7 @@ You can then run. If you don't want to install anything in your enviornment we have included a convenience script which uses docker. - ./build_with_docker.sh + ./docker_build.sh diff --git a/.github/STYLE.md b/.github/STYLE.md new file mode 100644 index 0000000..2914dcb --- /dev/null +++ b/.github/STYLE.md @@ -0,0 +1,154 @@ +Style guide +=========== + +## File format + +The text should be standard markdown with the addition of the footnotes extension. +It should be readable in markdown form. +Avoid using HTML. +Simple HTML is allowed when there are no good alternatives. + +## Accuracy of text + +These are course notes, not a transcript. +I try to discern and communicate his intended message, +not necessarily make a historical record of the conversation. + +If Alex is sharing a strong opinion or giving a speech, +it is likely an exact quote (minus grammar and filler words). +The technical exposition is where +I have rearranged and filled in missing parts +for educational purposes. + +## Voice + +Try to match Alex's voice. + +## Exercises + +### When to include? + +Exercises may originate from Alex or from our own suggestions. +They should always reinforce material learned, +or offer challenges to prepare for upcoming material. + +Every chapter should have at least one. + +### How difficult should they be? + +It should not be obvious how to get started (avoid tricky puzzles) +but they may take some programming time. + +### Providing solutions + +When exercises have a solution later in the text, +this should be indicated by a parenthetical statement such as: + +> Find the sum of integers from 1 to n (solved in chapter 3). + +Exercises may be solved in footnotes. + +## Links + +### When should I use inline links? + +Inline links should correspond exactly with their subject. +They should be obvious and only come from general sources. +For example, we might provide an inline link to a person's Wikipedia page +or personal website when using their name. +Another example is linking to a documentation page when mentioning +an STL function like `std::sort`. + +However, when linking off to more detailed explanations, +or information from a specialized source, a footnote +with a brief explanation should be provided. + +### Why use inline links at all? + +Many of the links may seem unnecessary as they are a quick Google search away. +I added those for convenience and historical purposes. +References which are obvious now may be difficult to track down in the future. + +### Where should links be placed in the markdown file? + +Links should be placed at the bottom of the section of the content they are related to. +This makes it easier to move content around. + +## Story + +Each chapter should begin with an engaging story or opinion. + +## Code + +Each chapter should have links to completed code files at the bottom. 
+The code follows Alex's style, including: + +- two space indent +- snake case (STL naming) +- { on same line as statement + +## References to literature + +Titles of books and articles are referred to with quotes: "The Art of Computer Programming". + +## Headings + +Only the first word of a heading should be capitalized. + +## Alex footnotes + +Occasionally Alex will make comments which are interesting but not related +to the lesson or directly applicable to the audience. +These can be moved to footnotes, but should be indicated +in the following manner: + + > Alex: here is a quote. + +No quotation marks are used. + +## Quotes + +Alex will sometimes speak in the voice of another person or group of people. +For example: + + You might say: "Who are you to tell us?" + +Quotation marks should be used in these cases. + +## Definitions + +When defining or introducing a new word, +it should be in **bold**. + +## Footnotes + +### When are footnotes required? + +When Alex references a piece of literature, an individual, or an event, we should find sources to support it. + +### When are optional footnotes appropriate? + +Footnotes can be added for any of the following reasons: + +- explain more about the material +- connect it to another subject or topic +- provide additional examples +- provide supporting references + +### What sources should footnotes refer to? + +When providing additional explanation or reference material, +we should prefer books which Alex recommends over other materials. +These include: + +- "The Art of Computer Programming" +- "Structure and Interpretation of Computer Programs" +- cppreference (rather than cplusplus.com) +- "Computer Architecture" (Brooks) + +## Math + +Indexed variables should have an underscore between the variable and index: `a_1`. +Otherwise, there might be confusion as to whether the numbers are part of the expression. + + diff --git a/00_foreword.html b/00_foreword.html index 4514b90..f5fdf84 100644 --- a/00_foreword.html +++ b/00_foreword.html @@ -2,116 +2,9 @@ + Foreword - + @@ -179,7 +72,7 @@

Foreword

as opposed to traditional black-box layering which merely hides information. Programs written this way also tend to be fast, -which contributes to its reusability. +which contributes to their reusability. If a component doesn’t perform, there will always be some performance-sensitive application that can’t use it. @@ -209,8 +102,7 @@

Foreword

To understand these catalogs and make contributions of their own, programmers must be educated in basic mathematics and computer science -in the same way that engineers require -physics and calculus.

+in the same way that engineers require physics and calculus.

– Justin Meiners

@@ -221,14 +113,16 @@

Acknowledgments

alex

-

Course notes assembled from videos, course materials, interviews, +

Course notes were assembled from videos, course materials, interviews, and books by Justin Meiners in 2021. The lectures were given in 2013 at A9. -These notes are intended -to share scientific information for educational and historical purposes. -This is a non-commercial project. -Most of the code comes from Ryan Ernst’s repo, -who attended the lectures.

+These notes are intended to share scientific information for educational and historical purposes. This is a non-commercial project. +Most of the code comes from Ryan Ernst’s repo, who attended the lectures.

+ +

Special thanks to Alastair Harrison for his significant efforts editing and providing feedback.

+ +

The following people also helped provide corrections: +Petter Holmberg, Ryan Pendleton, Frank Ettwein, Yuri Valentini.

FAQ

@@ -241,13 +135,12 @@

FAQ

The videos are often hard to watch due to the slow pace and interaction with the audience. Sometimes a mistake is made, and they go back and fix it. Alex may introduce a story and then finish it days later. -Consequently, some videos have fewer than 800 views, -and at least 10 of those views are mine.

+Consequently, some videos have fewer than 800 views, and at least 10 of those views are mine.

Is all this information available in his books?

-

A majority, but not all of the code, is available in other forms. -In the lectures, you get history, opinions, motivation, practical tips, applications, and responses to criticism, all at once. +

A majority of the information, but not the code, is available in other forms. +In these lectures, you get history, opinions, motivation, practical tips, applications, and responses to criticism, all at once. This is not present in the books, especially “Elements of Programming” which is very formal. Having this rich context makes the books much more approachable and meaningful.

@@ -266,7 +159,7 @@

FAQ

to follow. Try the first lesson.

-

How similar is Alex’s vision to the “modern C++” style?

+

How similar is Alex’s vision to the “modern C++” style?

The emphasis on value types and templates over dynamically allocated objects with virtual members is similar, @@ -296,19 +189,13 @@

FAQ

How accurate is this text?

-

These are course notes, not a transcript. +

The goal of this project is to provide course notes, not a transcript. I try to discern and communicate his intended message, not necessarily make a historical record of the conversation.

If Alex is sharing a strong opinion or giving a speech, -it is likely an exact quote (minus grammar and filler words). -The technical exposition is where -I have rearranged and filled in missing parts -for educational purposes.

- -

Many of the links may seem unnecessary as they are a quick Google search away. -I added those for convenience and historical purposes. -References which are obvious now may be difficult to track down in the future.

+I usually quote him exactly (with corrections to grammar) to avoid misinterpreting him. +The technical exposition is where significant rearrangements and additions have been made for educational purposes.

[ diff --git a/00_foreword.md b/00_foreword.md index eec021c..79c4a0b 100644 --- a/00_foreword.md +++ b/00_foreword.md @@ -50,7 +50,7 @@ code is reduced to its essential function, as opposed to traditional black-box layering which merely hides information. Programs written this way also tend to be fast, -which contributes to its reusability. +which contributes to their reusability. If a component doesn't perform, there will always be some performance-sensitive application that can't use it. @@ -80,8 +80,7 @@ and assemble them into applications. To understand these catalogs and make contributions of their own, programmers must be educated in basic mathematics and computer science -in the same way that engineers require -physics and calculus. +in the same way that engineers require physics and calculus. -- Justin Meiners @@ -94,14 +93,21 @@ Original course by Alexander. A. Stepanov. ![alex](img/alex.jpg) -Course notes assembled from videos, course materials, interviews, +Course notes were assembled from videos, course materials, interviews, and books by [Justin Meiners](https://github.com/justinmeiners) in 2021. The lectures were given in 2013 at [A9](https://en.wikipedia.org/wiki/A9.com). -These notes are intended -to share scientific information for educational and historical purposes. -This is a non-commercial project. -Most of the code comes from [Ryan Ernst's repo](https://github.com/rjernst), -who attended the lectures. +These notes are intended to share scientific information for educational and historical purposes. This is a non-commercial project. +Most of the code comes from [Ryan Ernst's repo](https://github.com/rjernst), who attended the lectures. + +Special thanks to [Alastair Harrison][aharrison] for his significant efforts editing and providing feedback. + +The following people also helped provide corrections: +[Petter Holmberg][petter], [Ryan Pendleton][ryanp], Frank Ettwein, [Yuri Valentini][yuroller]. + +[aharrison]: https://github.com/aharrison24 +[ryanp]: https://github.com/rpendleton +[petter]: https://github.com/petter-holmberg +[yuroller]: https://github.com/yuroller # FAQ @@ -113,15 +119,14 @@ and working code. The videos are often hard to watch due to the slow pace and interaction with the audience. Sometimes a mistake is made, and they go back and fix it. Alex may introduce a story and then finish it days later. -Consequently, some videos have fewer than 800 views, -and at least 10 of those views are mine. +Consequently, some videos have fewer than 800 views, and at least 10 of those views are mine. [videos]: https://www.youtube.com/watch?v=aIHAEYyoTUc&list=PLHxtyCq_WDLXryyw91lahwdtpZsmo4BGD **Is all this information available in his books?** -A majority, but not all of the code, is available in other forms. -In the lectures, you get history, opinions, motivation, practical tips, applications, and responses to criticism, all at once. +A majority of the information, but not the code, is available in other forms. +In these lectures, you get history, opinions, motivation, practical tips, applications, and responses to criticism, all at once. This is not present in the books, especially "Elements of Programming" which is very formal. Having this rich context makes the books much more approachable and meaningful. @@ -140,7 +145,7 @@ this presentation will probably be difficult to follow. Try the first lesson. 
-**How similar is Alex's vision to the "modern C++" style?** +**How similar is Alex's vision to the "modern C++" style?** The emphasis on value types and templates over dynamically allocated objects with virtual members is similar, @@ -172,16 +177,11 @@ comes from the speaker and is intentionally preserved. **How accurate is this text?** -These are course notes, not a transcript. +The goal of this project is to provide course notes, not a transcript. I try to discern and communicate his intended message, not necessarily make a historical record of the conversation. If Alex is sharing a strong opinion or giving a speech, -it is likely an exact quote (minus grammar and filler words). -The technical exposition is where -I have rearranged and filled in missing parts -for educational purposes. - -Many of the links may seem unnecessary as they are a quick Google search away. -I added those for convenience and historical purposes. -References which are obvious now may be difficult to track down in the future. +I usually quote him exactly (with corrections to grammar) to avoid misinterpreting him. +The technical exposition is where significant rearrangements and additions have been made for educational purposes. + diff --git a/01_data_structures.html b/01_data_structures.html index f4db6ec..a2d73cc 100644 --- a/01_data_structures.html +++ b/01_data_structures.html @@ -2,116 +2,9 @@ + 1. Choosing data structures and algorithms - + @@ -134,8 +27,8 @@

1. Choosing data structures and algorithms

Reflections on Trusting Trust

Ken Thompson did many wonderful things. Probably more -than any programmer, he influenced the programming style which we have right -now. While he did not invent C, that was done by his friend Dennis Ritchie, +than any programmer, he influenced the programming style which we have right +now. While he did not invent C (that was done by his friend Dennis Ritchie), he invented the programming style which underlies C. Dealing with pointers, knowing how pointers are subtracted, and stuff like that, all comes from Ken Thompson.

@@ -143,12 +36,12 @@

Reflections on Trusting Trust

Believe it or not, the best and brightest at that time were heavily on the march to get rid of pointers. Absolutely brilliant people who would give Turing award speeches. -Tony Hoare is the case in point saying that pointers -have to be abolished. +Tony Hoare is the case in point saying that pointers +have to be abolished1. The idea was that pointer is illegitimate thing. Language designers were fully convinced that even if you provide pointers nobody should call them pointers. -They would call them something like access types. +They would call them something like access types. They have to prohibit iterations, like you could never subtract pointers. Niklaus Wirth, the designer of Pascal, was very certain that you should never allow subtraction between pointers. @@ -159,19 +52,19 @@

Reflections on Trusting Trust

So Ken is an absolutely great man in many respects. His career started, not with UNIX, but when he was freshly out of -school coming up with the brilliant practical algorithm for matching regular expressions. +school coming up with the brilliant practical algorithm for matching regular expressions2. Every time you write grep or something like that you’re most likely exercising code written by Ken in the late 60s.

There is so much practical reliance on regular expressions. But they were invented believe it or not by theoreticians, specifically Stephen Kleene II who was a logician. -So Ken made them practical1.

+So Ken made them practical3.

Then he did UNIX, this totally brilliant operating system on which we all rely. All of our livelihoods come from Ken, in one shape or form. -Do you remember the fortune cookie program?
-One of the fortune cookies was: “it is all Ken’s fault”2. +Do you remember the fortune cookie program? +One of the fortune cookies was: “it is all Ken’s fault”4. We have to remember that.

After he does that, just to demonstrate that he can do anything, he decides to show @@ -180,7 +73,7 @@

Reflections on Trusting Trust

barely knew the moves. There was total dominance by Russian chess playing program Kaissa and two or three years he builds this specialized hardware, -totally revolutionizes the approach to just playing. +totally revolutionizes the approach to just playing. Of course nobody remembers what Kaissa was. He is a guy who could enter in a @@ -192,19 +85,19 @@

Reflections on Trusting Trust

recommend that you read called “Reflections on Trusting Trust” which talks about many things. But, I’ll use just one little episode in the beginning which is very important from my point of view. -He says that he was very blessed with collaborators specifically with Dennis3. +He says that he was very blessed with collaborators specifically with Dennis5. Then he continues that during many years of their collaboration not once they would attempt to do the same thing. They had this good sense to work on complementary things. Except once. Both of them needed to write an assembly program with about -20 lines of code and continues Ken, “I checked it was character by character +20 lines of code and continues Ken, “I checked, it was character by character identical”. It sounds like mystical thing right, two people coming together, this is a Vulcan mind-meld (joke).

But, there is something profound there. -I actually think that it’s less amazing then Ken makes it be. +I actually think that it’s less amazing than Ken makes it be. That is a central point of what I am trying to teach here. I actually had such mind-meld with several of my colleagues after we worked together for a while. For example, it’s literally true that when Dave Musser and I were working together long-distance, @@ -224,14 +117,14 @@

Hello, World

a room of programmers know it’s going to be an empty set. That doesn’t mean that you as an individual person doesn’t know things, but intersection is going to be relatively small.

-

Unfortunatly we’ve got to a point where nobody teaches programming. -Because there’s no professor of computer science who has any idea how to program4. +

Unfortunately we’ve got to a point where nobody teaches programming. +Because there’s no professor of computer science who has any idea how to program6. Part of what I am trying to do here is to start discussing little things like, how do we name variables? Why is it important? We want to get to the point where everybody writes a consistent code, recognizable code. -This is why I’m I want to go slow and develop so that we -all agree5.

+This is why I want to go slow and develop so that we +all agree7.

We have to start with a very very simple program. Most of you recognize it, right? Hello World.

@@ -249,7 +142,7 @@

Hello, World

through the last brace, zero is returned. Because the standard UNIX convention, which became Universal convention, that on success you return zero. -The language actually allows you to do things things like that.

+The language actually allows you to do things like that.
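For reference, here is the little program under discussion (a sketch of the canonical version, not necessarily Alex’s exact file):

    #include <iostream>

    int main() {
        std::cout << "Hello, world!" << std::endl;
        // No return statement: control reaching the last brace of main
        // returns zero, the UNIX convention for success.
    }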

One warning, cout stuff works pretty much like how you think it works. However a friend of mine wrote something like:

@@ -259,7 +152,7 @@

Hello, World

Lo and behold depending on the compiler, different results were printed. The order of evaluation of arguments to a function is not defined -and the operator << is just a function call6. +and the operator << is just a function call8. Never depend on the arguments to a function being evaluated in a given order.
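A minimal sketch of the trap (my own example, not the friend’s actual code): each << below is a function call taking next() as an argument, so before C++17 a compiler may evaluate the two calls in either order.

    #include <iostream>

    int counter = 0;
    int next() { return ++counter; }

    int main() {
        // May print "1 2" or "2 1" depending on the compiler.
        std::cout << next() << " " << next() << std::endl;
    }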

@@ -272,10 +165,10 @@

Number of unique elements

This whole talk about performance started me and some coworkers investigating problems and involved very long queries. When we narrowed things down we always found something -pertaining to incorrect use of STL7. +pertaining to incorrect use of STL9. It’s usually a one liner and the most egregious.

-

The most most amazing thing is the following one liner which I will start in the +

The most amazing thing is the following one liner which I will start in the beginning of the course, because it was just so important. We could weave the whole fabric of the course around this one little example. There was a range of integers. For example:

@@ -284,7 +177,7 @@

Number of unique elements

The person wanted to find out how many of these integers are unique. -The person decided to write the following code8:

+The person decided to write the following code10:

#include <iostream>
 #include <set>
@@ -301,7 +194,7 @@ 

Number of unique elements

Equality vs ordering

-

The algorithm for set 9 is red black tree which is +

The algorithm for set 11 is red black tree which is universally stated in every book to be one of the best implementation of sets. Do we have to believe textbooks? We will see. In any case, we’ll learn to count number of operations. @@ -311,20 +204,20 @@

Equality vs ordering

Well, it has to do sort, but without actually sorting. Otherwise, you really cannot find out how many unique elements are in a sequence of integers. This is a paradoxical thing since finding unique elements is much -easier operation10.

+easier operation12.

It seems that finding unique elements does not require ordering, it just requires equality. But, what we are actually doing is optimizing a search or find. Equality gives us linear search while sorting gives us binary search so we can find much much faster. One of the amazing things which we will discover is that ordering is very important. -Things which we could do with ordering cannot be effectively done just with equality11.

+Things which we could do with ordering cannot be effectively done just with equality13.

-

If we are going by the book we we will say sorting is good as long it does approximately O(n log(n)) operations. +

If we are going by the book we will say sorting is good as long as it does approximately O(n log(n)) operations. This is good for the books. It’s actually not good for our code, because Big O could have an arbitrary coefficient. -It could be 50(n log(n))12, +It could be 50(n log(n))14. The effective coefficient, not just in terms of number of operations, but in terms of time for this is very very large. How large? We will be doing measurements together of this very problem. @@ -336,7 +229,7 @@
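To get a feel for the size of that coefficient (my arithmetic, not Alex’s): for n = 10^6, n log(n) is about twenty million operations, while 50(n log(n)) is about a billion, the difference between a blink and a noticeable pause.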

Correct solution

How should we solve the problem? Could we replace it with a one-liner? There is an STL function just for this. -It is called std::unique13. +It is called std::unique15. What does it do? It goes through a range and removes all non-unique elements, giving back a range of unique elements. @@ -397,15 +290,15 @@
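Putting the pieces together, a sketch of the sort-then-count approach (the variable names are mine):

    #include <algorithm>
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> data{1, 3, 1, 4, 1, 5};
        std::sort(data.begin(), data.end());              // equal elements become adjacent
        auto last = std::unique(data.begin(), data.end()); // keep one element per run
        std::cout << last - data.begin() << std::endl;     // prints 4, the unique count
    }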

Use the correct STL data structures

If you ask somebody to create a very complex data structure that’s what you’re going to get. You’re going to get problems with node allocation. -You are going to problems with rebalancing. +You are going to get problems with rebalancing. You’re going to get problem with whatever these advanced data structures do. These problems are going to get worse and worse. You’re going to get a cache miss on every step through the set.

-

As our computers become faster, faster, and faster they’re getting slower and -slower and slower14. -Meaning that going to the main memory is very slow. +

As our computers become faster and faster and faster they’re getting slower and +slower and slower16. +Meaning that going to the main memory is very slow. You want to have locality of reference. You want all your data that you are working on in the local cache. If you have a huge set or map it’s not going to be like that.

@@ -441,7 +334,7 @@

Use the correct STL data structures

As long as something gets into a set and it is not erased the pointer, it is fixed. For example, him sitting in this chair is in the set. As long as he’s in this set he will not move from his chair. -You could find him constant time. +You could find him in constant time. It’s a very useful thing except most people do not use set for that.
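A small sketch of that property (the example is mine): an element of a std::set keeps its address while other elements come and go.

    #include <cassert>
    #include <set>

    int main() {
        std::set<int> s{10, 20, 30};
        const int* where = &*s.find(20); // address of an element in the set
        s.insert(5);                     // insert other elements ...
        s.erase(30);                     // ... and erase other elements
        assert(where == &*s.find(20));   // 20 has not moved
    }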

@@ -485,7 +378,7 @@

Two fundamental performance metrics

This is very good, we have to have that. But then we also need to have times for specific commonly used types. So these are two things which we need to do. -Our goal is to to be able to measure this problem with set and with sort. +Our goal is to be able to measure this problem with set and with sort. But, first we need to learn more about C++.

@@ -500,75 +393,81 @@

Code


  1. -Alex recommended to me Marvin Minsky’s “Computation: Finite and Infinite Machines” +From the 1980 Turing award lecture “The Emperor’s New Clothes”.
  2. +
  3. +Ken’s brilliant algorithm is to generate a finite state machine to recognize a given expression. +See “Regular Expression Matching Can Be Simple And Fast” or Ken’s original paper “Regular Expression Search Algorithm”.
  4. +
  5. +Alex recommended to me Marvin Minsky’s “Computation: Finite and Infinite Machines” to learn more about these topics. It is a fantastic book which explores theory of computation, including finite state machines, neural networks, -and Turing machines from a philosophical and mathematical perspective.
  6. -
  7. +and Turing machines from a philosophical and mathematical perspective.
  8. +
  9. I have been unable to find any reference to this quote. Some older fortune files -such as this file from plan-9 contain similar quotes, -such as “Maybe I should have screwed up”
  10. -
  11. -Alex: In my opinion Dennis wasn' at genius like Ken, - but obviously first class.
  12. -
  13. -Alex: At Stanford there’s one guy who knows, but he’s an emeritus (Donald Knuth).
  14. +such as this file from plan-9 contain similar quotes, +such as “Maybe I should have screwed up”
  15. +Alex: In my opinion Dennis wasn’t a genius like Ken, + but obviously first class.
  16. +
  17. +Alex: At Stanford there’s one guy who knows, but he’s an emeritus (Donald Knuth).
  18. +
  19. Alex clarifies in later lectures that he is somewhat convention neutral. Whenever he works with a new group, he wants to reach consensus about style as quickly as possible. The organization and team should follow the same conventions. Clearly some conventions he has an argument for (when to use classes vs structs) but others he say doesn’t matter at all, as long as it’s consistent -(which line to put the brace on).
  20. -
  21. -Alex: << is not some special form (like the notion in Lisp), it is just a function call.
  22. -
  23. +(which line to put the brace on).
  24. +
  25. +Alex: << is not some special form (like the notion in Lisp), it is just a function call.
  26. +
  27. STL (standard template library) was of course originally written by the teacher Alex Stepanov. The original library was written for SGI, but then later integrated into the standard library. However, the version available in most implementations is much more complex and less consistent than Alex’s original work. I have archived the SGI STL code which probably contains more of Alex' original code than other that is available.

    The SGI STL documentation has also been archived and is a very useful reference alongside modern references, -especially for understanding concepts.

  28. -
  29. +especially for understanding concepts.

  30. +
  31. Alex: You didn’t use to need #include <set> when I first did STL. -I had one gigantic a file called stl.h -At that time people said it’s utterly unacceptable to twenty thousand lines of code. +I had one gigantic file called stl.h +At that time people said it’s utterly unacceptable, you have twenty thousand lines of code. If you think about it nowadays it appears to be very funny. Twenty thousand is nothing. That’s why stl.h was replaced with a gazillion little headers. -Sometimes I have no idea what they are.
  32. -
  33. -The set data structure is inspired by mathematical sets, which contain elements (all of which are distinct) in no-particular order.
  34. -
  35. +Sometimes I have no idea what they are.
  36. +
  37. +The set data structure is inspired by mathematical sets, which contain elements (all of which are distinct) in no-particular order.
  38. +
  39. Alex alludes to there being a linear time solution if we have extra memory available. One possible algorithm is using a lookup table. -For example, one could allocate an array with 2^32 entries and count how many times integers are found only once.
  40. -
  41. +For example, one could allocate an array with 2^32 entries and count how many times integers are found only once.
  42. +
  43. Alex: Common Lisp made a similar mistake. Some of you think it was a great language. They carefully designed a bunch of algorithms which work on arbitrary sequences and one of the algorithms was called remove-duplicates and it relied on equality. -It actually would go through N times rremoving things. +It actually would go through N times removing things. They did not sort. They use equality. It did work and it worked exceptionally well for most of the applications. In Lisp you usually have a list with 5 elements so it works just fine. The moment you go to larger lists things stop working. -Quadratic algorithms are slow.
  44. -
  45. +Quadratic algorithms are slow.
  46. +
  47. To say a function f is O(g) it’s only required that f be bounded - by a constant multiple for all inputs beyond some point. -Formally, if f(x) <= Ag(x) for all x > M, where A and M are constant real numbers. -Alex will tend to say “order of” to distinguish O(f) from exactly f.

    +by a constant multiple for all inputs beyond some point. +Formally, if f(x) <= Ag(x) for all x > M, where A and M are constant real numbers +(see “The Art of Computer Programming” 1.2.11).

    -

    Alex is observing that there is no requirement for A to be small.

  48. -
  49. +

    Alex will tend to say “order of” to distinguish O(f) from exactly f. +In this section, Alex is observing that there is no requirement for A to be small.

  50. +
  51. Alex: Ken Thompson changed the spelling of unique (uniq(1)). -My great contribution to computer science was to restore it (joke).
  52. -
  53. +My great contribution to computer science was to restore it (joke).
  54. +
  55. This performance description is based on several trends in computer architecture. The first is that memory, especially main memory hasn’t increased in read/write speed relative to the increase in processor speed. @@ -580,7 +479,7 @@

    Code

    The CPU does a lot of guessing to anticipate what data needs to be in the cache for a given section of code. Data structures and algorithms which jump over the address space tend to mess this up -so the CPU must pause and load data in and out of cache.

  56. +so the CPU must pause and load data in and out of cache.

diff --git a/01_data_structures.md b/01_data_structures.md index 932f740..925bf70 100644 --- a/01_data_structures.md +++ b/01_data_structures.md @@ -4,8 +4,8 @@ ## Reflections on Trusting Trust [Ken Thompson][ken] did many wonderful things. Probably more -than any programmer, he influenced the programming style which we have right -now. While he did not invent C, that was done by his friend [Dennis Ritchie][dennis], +than any programmer, he influenced the programming style which we have right +now. While he did not invent C (that was done by his friend [Dennis Ritchie][dennis]), he invented the programming style which underlies C. Dealing with pointers, knowing how pointers are subtracted, and stuff like that, all comes from Ken Thompson. @@ -13,12 +13,12 @@ knowing how pointers are subtracted, and stuff like that, all comes from Ken Tho Believe it or not, the best and brightest at that time were heavily on the march to get rid of pointers. Absolutely brilliant people who would give [Turing award][turing-award] speeches. -[Tony Hoare][hoare] is the case in point [saying][hoare-speech] that pointers -have to be abolished. +[Tony Hoare][hoare] is the case in point saying that pointers +have to be abolished[^hoare-speech]. The idea was that pointer is illegitimate thing. Language designers were fully convinced that even if you provide pointers nobody should call them pointers. -They would call them something like [access types][access-types]. +They would call them something like [access types][access-types]. They have to prohibit iterations, like you could never subtract pointers. [Niklaus Wirth][wirth], the designer of Pascal, was very certain that you should never allow subtraction between pointers. @@ -29,7 +29,7 @@ But, pointers are still at least partially with us. So Ken is an absolutely great man in many respects. His career started, not with UNIX, but when he was freshly out of -school coming up with the brilliant [practical algorithm][regex] for matching regular expressions. +school coming up with the brilliant practical algorithm for matching regular expressions[^regex-algorithm]. Every time you write [grep][grep] or something like that you're most likely exercising code written by Ken in the late 60s. @@ -40,7 +40,7 @@ So Ken made them practical[^minsky]. Then he did UNIX, this totally brilliant operating system on which we all rely. All of our livelihoods come from Ken, in one shape or form. -Do you remember the [fortune cookie program][fortune]? +Do you remember the [fortune cookie program][fortune]? One of the fortune cookies was: "it is all Ken's fault"[^fortune-quote]. We have to remember that. @@ -50,7 +50,7 @@ Mind it, at that point as far as I know he barely knew the moves. There was total dominance by Russian chess playing program [Kaissa][russia-chess] and two or three years he builds this specialized hardware, -totally revolutionizes the approach to just playing. +totally revolutionizes the approach to just playing. Of course nobody remembers what Kaissa was. He is a guy who could enter in a @@ -68,13 +68,13 @@ years of their collaboration not once they would attempt to do the same thing. They had this good sense to work on complementary things. Except once. Both of them needed to write an assembly program with about -20 lines of code and continues Ken, "I checked it was character by character +20 lines of code and continues Ken, "I checked, it was character by character identical". 
It sounds like mystical thing right, two people coming together, this is a Vulcan mind-meld (joke). But, there is something profound there. -I actually think that it's less amazing then Ken makes it be. +I actually think that it's less amazing than Ken makes it be. That is a central point of what I am trying to teach here. I actually had such mind-meld with several of my colleagues after we worked together for a while. For example, it's literally true that when [Dave Musser][musser] and I were working together long-distance, @@ -87,24 +87,27 @@ not be a miracle. We should be writing basic code, character by character, identical. Imagine how wonderful it would be if you could understand what someone else wrote. -[^minsky]: Alex recommended to me Marvin Minsky's ["Computation: Finite and Infinite Machines"](minsky-computation) +[^minsky]: Alex recommended to me Marvin Minsky's ["Computation: Finite and Infinite Machines"][minsky-computation] to learn more about these topics. It is a fantastic book which explores theory of computation, including finite state machines, neural networks, and Turing machines from a philosophical and mathematical perspective. [^fortune-quote]: I have been unable to find any reference to this quote. Some older fortune files - such as [this file][plan-9-fortune] from plan-9 contain similar quotes, + such as [this file][plan-9-fortune] from plan-9 contain similar quotes, such as "Maybe I should have screwed up" -[^dennis]: Alex: In my opinion Dennis wasn' at genius like Ken, +[^dennis]: Alex: In my opinion Dennis wasn't a genius like Ken, but obviously first class. [^teaching-programming]: Alex: At Stanford there's one guy who knows, but he's an emeritus (Donald Knuth). -[fortune]: https://en.wikipedia.org/wiki/Fortune_(Unix) +[^hoare-speech]: From the 1980 Turing award lecture ["The Emperor's New Clothes"][hoare-speech]. +[^regex-algorithm]: Ken's brilliant algorithm is to generate a finite state machine to recognize a given expression. + See ["Regular Expression Matching Can Be Simple And Fast"][regex-fast] or Ken's original paper ["Regular Expression Search Algorithm"][regex-paper] +[fortune]: https://en.wikipedia.org/wiki/Fortune_(Unix) [dennis]: https://en.wikipedia.org/wiki/Dennis_Ritchie [kleene]: https://en.wikipedia.org/wiki/Stephen_Cole_Kleene [russia-chess]: https://en.wikipedia.org/wiki/Kaissa @@ -116,14 +119,14 @@ Imagine how wonderful it would be if you could understand what someone else wrot [grep]: https://linux.die.net/man/1/grep [plan-9-fortune]: http://fortunes.cat-v.org/plan_9/ -[regex]: https://swtch.com/~rsc/regexp/regexp1.html +[regex-fast]: https://swtch.com/~rsc/regexp/regexp1.html [turing-award]: https://en.wikipedia.org/wiki/Turing_Award [hoare]: https://en.wikipedia.org/wiki/Tony_Hoare [hoare-speech]: https://www.cs.fsu.edu/~engelen/courses/COP4610/hoare.pdf [wirth]: https://en.wikipedia.org/wiki/Niklaus_Wirth - [access-types]: http://goanna.cs.rmit.edu.au/dale/ada/aln/13_access_types.html +[regex-paper]: papers/regular-expressions.pdf ## Hello, World @@ -131,13 +134,13 @@ Whomever really wants to learn, will learn, and that is a challenge because it i a room of programmers know it's going to be an empty set. That doesn't mean that you as an individual person doesn't know things, but intersection is going to be relatively small. -Unfortunatly we've got to a point where nobody teaches programming. +Unfortunately we've got to a point where nobody teaches programming. 
Because there's no professor of computer science who has any idea how to program[^teaching-programming]. Part of what I am trying to do here is to start discussing little things like, how do we name variables? Why is it important? We want to get to the point where everybody writes a consistent code, recognizable code. -This is why I'm I want to go slow and develop so that we +This is why I want to go slow and develop so that we all agree[^conventions]. We have to start with a very very simple program. Most of you recognize it, right? [Hello World][hello-world]. @@ -155,7 +158,7 @@ C++ treats `main` as a special function; meaning when the control goes through the last brace, zero is returned. Because the standard UNIX convention, which became Universal convention, that on success you return zero. -The language actually allows you to do things things like that. +The language actually allows you to do things like that. One warning, [`cout`][cout] stuff works pretty much like how you think it works. However a friend of mine wrote something like: @@ -205,7 +208,7 @@ When we narrowed things down we always found something pertaining to incorrect use of STL[^about-stl]. It's usually a one liner and the most egregious. -The most most amazing thing is the following one liner which I will start in the +The most amazing thing is the following one liner which I will start in the beginning of the course, because it was just so important. We could weave the whole fabric of the course around this one little example. There was a range of integers. For example: @@ -247,7 +250,7 @@ Equality gives us linear search while sorting gives us binary search so we can find much much faster. One of the amazing things which we will discover is that ordering is very important. Things which we could do with ordering cannot be effectively done just with equality[^cl-mistake]. -If we are going by the book we we will say sorting is good as long it does approximately `O(n log(n))` operations. +If we are going by the book we will say sorting is good as long as it does approximately `O(n log(n))` operations. This is good for the books. It's actually not good for our code, because Big O could have an arbitrary coefficient. @@ -264,17 +267,17 @@ That is a lot. One possible algorithm is using a lookup table. For example, one could allocate an array with `2^32` entries and count how many times integers are found only once. +[^big-oh]: To say a function `f` is `O(g)` it's only required that `f` be bounded + by a constant multiple for all inputs beyond some point. + Formally, if `f(x) <= Ag(x)` for all `x > M`, where `A` and `M` are constant real numbers + (see "The Art of Computer Programming" 1.2.11). -[^big-oh]: To say a function `f` is `O(g)` it's only required that `f` be bounded - by a constant multiple for all inputs beyond some point. - Formally, if `f(x) <= Ag(x)` for all `x > M`, where `A` and `M` are constant real numbers. Alex will tend to say "order of" to distinguish `O(f)` from exactly `f`. - - Alex is observing that there is no requirement for `A` to be small. + In this section, Alex is observing that there is no requirement for `A` to be small. [^headers]: Alex: You didn't use to need `#include ` when I first did STL. - I had one gigantic a file called `stl.h` - At that time people said it's utterly unacceptable to twenty thousand lines of code. + I had one gigantic file called `stl.h`. + At that time people said it's utterly unacceptable, you have twenty thousand lines of code. If you think about it nowadays it appears to be very funny. 
Twenty thousand is nothing. That's why `stl.h` was replaced with a gazillion little headers. Sometimes I have no idea what they are. @@ -284,7 +287,7 @@ That is a lot. Some of you think it was a great language. They carefully designed a bunch of algorithms which work on arbitrary sequences and one of the algorithms was called [`remove-duplicates`][clhs-duplicates] and it relied on equality. - It actually would go through `N` times rremoving things. + It actually would go through `N` times removing things. They did not sort. They use equality. It did work and it worked exceptionally well for most of the applications. @@ -360,15 +363,15 @@ Miracles are not possible. If you ask somebody to create a very complex data structure that's what you're going to get. You're going to get problems with node allocation. -You are going to problems with rebalancing. +You are going to get problems with rebalancing. You're going to get problem with whatever these advanced data structures do. These problems are going to get worse and worse. You're going to get a [cache miss][cache] on every step through the set. -As our computers become faster, faster, and faster they're getting slower and +As our computers become faster and faster and faster they're getting slower and slower and slower[^slower]. -Meaning that going to the main memory is very slow. +Meaning that going to the main memory is very slow. You want to have locality of reference. You want all your data that you are working on in the local cache. If you have a huge set or map it's not going to be like that. @@ -402,7 +405,7 @@ You need a `set` only if you need a thing which does not move things around. As long as something gets into a `set` and it is not erased the pointer, it is fixed. For example, him sitting in this chair is in the `set`. As long as he's in this set he will not move from his chair. -You could find him constant time. +You could find him in constant time. It's a very useful thing except most people do not use set for that. [^slower]: This performance description is based on several trends in computer architecture. @@ -460,7 +463,7 @@ specific type. This is very good, we have to have that. But then we also need to have times for specific commonly used types. So these are two things which we need to do. -Our goal is to to be able to measure this problem with `set` and with sort. +Our goal is to be able to measure this problem with `set` and with sort. But, first we need to learn more about C++. [red-black]: https://en.wikipedia.org/wiki/Red%E2%80%93black_tree diff --git a/02_regular_types.html b/02_regular_types.html index c0f4b1b..7d602a0 100644 --- a/02_regular_types.html +++ b/02_regular_types.html @@ -2,116 +2,9 @@ + 2. Regular types and other concepts - + @@ -156,7 +49,7 @@

We don’t know how to program yet

I did interview at Facebook and let me tell you what I saw of that place… That was five years ago. Maybe they improved but I don’t think so. -Places tend to to decline.

+Places tend to decline.

You might say: “Who are you to tell us?” I am as guilty as any of you. @@ -196,7 +89,7 @@

The motivation for concepts

but actually there are requirements on the type, certain properties which are required for these containers to function.

-

Let, us think about what my task was when I started working on the C++ STL. +

Let us think about what my task was when I started working on the C++ STL. It was to define standard data structures which will work for any reasonable subset of types. What is a reasonable subset of types? This is a little bit tricky but it’s of paramount importance.

@@ -210,7 +103,7 @@

The motivation for concepts

It has to work for pointer.

To understand all this, we’re going to become a little bit theoretical. -None of this stuff actually works unless you understand at least a +None of this stuff actually works unless you understand at least a little bit of theory. We will call such reasonable types Regular. What we will do is formally define a set of operators that all Regular types must have, @@ -329,7 +222,7 @@

Equality operator

!= should always behave like: !(a == b). My very strong point is that the semantics of inequality (!=) is absolutely and totally bound to the semantics of equality (==). -You should not even be able to have a situation where they have different semantics. +You should not even be able to have a situation where they have different semantics. But, the standards committee disagrees with me on that. They say that you could have equality be equality and inequality be multiplication operator. @@ -361,7 +254,7 @@
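In code, the discipline Alex argues for looks like this (a sketch; the point struct is illustrative):

    // Define != once in terms of ==, so the two can never disagree.
    struct point {
        int x, y;
    };

    bool operator==(const point& a, const point& b) {
        return a.x == b.x && a.y == b.y;
    }

    bool operator!=(const point& a, const point& b) {
        return !(a == b);
    }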

Less than operator

< must obey the following mathematical properties:

-

Axiom 1: Anti-reflexive: !(a < a)

+

Axiom 1: Anti-reflexive: !(a < a).

Axiom 2: Transitive: If a < b and b < c then a < c.

@@ -371,7 +264,7 @@

Less than operator

In the same way that the semantics of == is related to !=, we can see the semantics of < must be totally bound to the semantics of equality. -But furthermore, < is related to four other operations which must also be defined:

+But furthermore, < is related to three other operations which must also be defined:

  1. < less than
  2. @@ -382,7 +275,7 @@

    Less than operator

    If you provide them then they have to have natural meaning. -For example, !(a < b) should be a >= b otherwise the world +For example, !(a < b) should be a >= b otherwise the world perishes.
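One way to honor that rule (a sketch; the type is illustrative) is to write only < and derive the other three comparisons from it:

    struct length {
        double meters;
    };

    bool operator<(const length& a, const length& b)  { return a.meters < b.meters; }
    bool operator>(const length& a, const length& b)  { return b < a; }
    bool operator<=(const length& a, const length& b) { return !(b < a); }
    bool operator>=(const length& a, const length& b) { return !(a < b); }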

    Later we will talk more about orderings, and define several different @@ -420,7 +313,7 @@

    Less than operator

    Containers of Regular types should be Regular.
  3. This is likely a reference to how std::shared_ptr behaves -which provides a form of automatic memory management.,
  4. +which provides a form of automatic memory management.
  5. Bjarne Stroustrup is the creator of C++, author of many C++ books, and always has been an active member of the community.
  6. diff --git a/02_regular_types.md b/02_regular_types.md index ceb0c70..5393f23 100644 --- a/02_regular_types.md +++ b/02_regular_types.md @@ -26,7 +26,7 @@ So, I'm not even going to mention Yahoo (joke). I did interview at Facebook and let me tell you what I saw of that place... That was five years ago. Maybe they improved but I don't think so. -Places tend to to decline. +Places tend to decline. You might say: "Who are you to tell us?" I am as guilty as any of you. @@ -79,7 +79,7 @@ You might think you can just put any type in a map, but actually there are requirements on the type, certain properties which are required for these containers to function. -Let, us think about what my task was when I started working on the C++ STL. +Let us think about what my task was when I started working on the C++ STL. It was to define standard data structures which will work for any reasonable subset of types. What is a reasonable subset of types? This is a little bit tricky but it's of paramount importance. @@ -93,7 +93,7 @@ even more important than `int`. It has to work for pointer. To understand all this, we're going to become a little bit theoretical. -None of this stuff actually works unless you understand at least a +None of this stuff actually works unless you understand at least a little bit of theory. We will call such reasonable types `Regular`. What we will do is formally define a set of operators that all `Regular` types must have, @@ -221,7 +221,7 @@ a copy, the original and the copy are equal. `!=` should always behave like: `!(a == b)`. My very strong point is that the semantics of inequality (`!=`) is absolutely and totally bound to the semantics of equality (`==`). -You should not even be able to have a situation where they have different semantics. +You should not even be able to have a situation where they have different semantics. But, the standards committee disagrees with me on that. They say that you could have equality be equality and inequality be multiplication operator. @@ -237,7 +237,7 @@ So even the paradigm of a member is the wrong paradigm. They are symmetric. [^smart-pointers]: This is likely a reference to how [`std::shared_ptr`][cpp-shared-ptr] behaves - which provides a form of automatic memory management., + which provides a form of automatic memory management. ## Total orderings @@ -255,17 +255,17 @@ to `TotallyOrdered`. `<` must obey the following [mathematical properties][total-order]: -**Axiom 1:** Anti-reflexive: `!(a < a)` +**Axiom 1:** Anti-reflexive: `!(a < a)`. **Axiom 2:** Transitive: If `a < b` and `b < c` then `a < c`. **Axiom 3:** Anti-symmetric: If `a < b` then `!(b < a)`. -**Axiom 4:** If `a != b` then `a < b` or ` b > a`. +**Axiom 4:** If `a != b` then `a < b` or `b > a`. In the same way that the semantics of `==` is related to `!=`, we can see the semantics of `<` must be totally bound to the semantics of equality. -But furthermore, `<` is related to four other operations which must also be defined: +But furthermore, `<` is related to three other operations which must also be defined: 1. `<` less than 2. `>` greater than @@ -273,7 +273,7 @@ But furthermore, `<` is related to four other operations which must also be def 4. `>=` greater than or equal to If you provide them then they have to have natural meaning. -For example, `!(a < b)` should be `a >= b` otherwise the world +For example, `!(a < b)` should be `a >= b` otherwise the world perishes. 
Later we will talk more about orderings, and define several different diff --git a/03_singleton.html b/03_singleton.html index a7e29b4..7ad3607 100644 --- a/03_singleton.html +++ b/03_singleton.html @@ -2,116 +2,9 @@ + 3. Singleton: a pattern for regular types - + @@ -134,7 +27,7 @@

    3. Singleton: a pattern for regular types

    Learning C++ isn’t as hard as it appears

    I talked to someone and he said “I have been programming in C++ for about eight years, but I never learned C++”. - That is a very common thing. +That is a very common thing. He was just honestly admitting something which happens to be the case for most people. Most of us never learn programming languages. I never learned C++ I was just very @@ -196,7 +89,7 @@

    Template and type functions

    another type. Let’s forget C++ for a second. Mathematically speaking we want to write a type function: a thing which takes a type in -and generates new types. +and generates new types. The template is the mechanism in C++ for doing just that.
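For instance, the singleton built in this chapter is exactly such a type function: it takes any type T in and yields the new type singleton<T> (sketch):

    template <typename T>
    struct singleton {
        T value;
    };

    singleton<int> a{42};      // a new type generated from int
    singleton<double> b{3.14}; // another one, generated from double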

    There are of course other type functions, even type functions in C. @@ -254,7 +147,7 @@

    Guidelines for writing classes

    Another way to think about it is to help preserve an invariant. There is some condition we need to assure that the values cannot change.

    -

    For singleton, we pick struct because it’s less typing. +

    For singleton, we pick struct because it’s less typing. We should find the most minimal way of saying what we want to say.

    @@ -365,7 +258,7 @@

    Virtual functions, virtual destructors, and OOP

    He tells you always declare destructor as virtual7. OK, he’s wrong. Simple as that.

    -

    What we do we want to create? Take type T, put it in a struct. Will the size of the thing increase? +

    What do we want to create? Take type T, put it in a struct. Will the size of the thing increase? No. It has no overhead. Singleton is the same size as T. It’s the wonderful thing about singleton, and lets you pack them in arrays. @@ -381,10 +274,10 @@
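The no-overhead claim is easy to check (my check, not from the lecture):

    template <typename T>
    struct singleton {
        T value;
    };

    // Same size as the wrapped type, so arrays of singletons
    // pack exactly like arrays of T.
    static_assert(sizeof(singleton<double>) == sizeof(double),
                  "singleton adds no overhead");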

    Virtual functions, virtual destructors, and OOP

    If you program in an object-oriented way then many good things might happen. I don’t know what they are but you are not going to be efficient. Bjarne used to joke that object -oriented systems are those that slow graphics.

    +oriented systems are those that have slow graphics.

    As time progresses, ++ is getting faster and faster and virtual function is getting slower and slower. -Their spread is growing and we’re not going to address +Their spread is growing and we’re not going to address any parts of C++ in this course which slow the computations.

    If you want to learn about virtual I’m sure there are lots @@ -429,8 +322,9 @@

    Equality and the three laws of thought

    The law of identity: a == a. Popeye the Sailor used to say, “I am, what I am”.

    Computers do not obey such law. -There is a fundamental case which breaks and has consequences. -Exercise: If you don’t believe me, try to figure out a type which violates the law of identity.

    +There is a fundamental case which breaks and has consequences.

    + +

    Exercise: If you don’t believe me, try to figure out a type which violates the law of identity (solved in chapter 6).

    The law of non-contradiction: You cannot have a predicate P be true and !P be true at the same time.

    @@ -455,7 +349,7 @@

    Why can’t the compiler generate == and !=?

    Then they fixed it. They said, “we’ll just copy the bits” and things would work.

    -

    Equality is defined for all built-in types; int, double, short, char, pointers, etc. +

    Equality is defined for all built-in types: int, double, short, char, pointers, etc. So, if you have a struct, why can’t you generate an equality that uses those? Two things are equal if all their members are equal. It’s a sensible rule a compiler could do.
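Written out by hand, the rule a compiler could apply looks like this (a sketch; the struct is illustrative):

    struct employee {
        int id;
        double salary;
    };

    // Member-wise equality: compare the members, not the raw bits,
    // so padding bytes never participate.
    bool operator==(const employee& a, const employee& b) {
        return a.id == b.id && a.salary == b.salary;
    }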

    @@ -463,7 +357,7 @@

    Why can’t the compiler generate == and !=?

    I never managed to ask Dennis Ritchie about it. But, I asked probably the second best person on the subject, Steve Johnson, because he actually implemented all these assignments for structures and things like that. -Steve told me, “it was very hard because the bits in the padding might not be equal. +Steve told me, “it was very hard because the bits in the padding might not be equal. If you compare bit by bit things which have equal members, but not equal bits in the padding, it will not work”10.

    But, why should you compare bit by bit? @@ -540,7 +434,7 @@

    Totally ordered singleton

    Understand how all the other operators are defined in terms of <. Contemplate it. It’s really important. -It is mathematics, but that’s mathematics every programmer should be able to do .

    +It is mathematics, but that’s mathematics every programmer should be able to do.

    Specifying concepts

    @@ -593,8 +487,7 @@

    Implicit type conversion

    Were they mad? It was because they were lazy. They couldn’t do it elegantly because they didn’t -have -function overloading.

    +have function overloading.

    In C++ we have sqrt(int), sqrt(double), etc You can pass int, or double and it does the right thing. @@ -647,10 +540,10 @@

    Implicit type conversion

    std::cin is convertible to a pointer. If it fails, it returns null with type void*. -Since it is convertible to a pointer, you can apply one +Since it is convertible to a pointer, you can apply one more conversion and convert this pointer to a Boolean. Then you could convert it to an integer. -So std::cin becomes zero and you shifted by forty two positions. +So std::cin becomes zero and you shifted by forty-two positions. Isn’t that beautiful (joke)?

    The problem is they fixed this problem by @@ -661,8 +554,9 @@

    Implicit type conversion

    Explicit conversions are not going to be called implicitly, unless they are in while, if, and other conditions which people used for input streams. So, the entire type system is screwed up (technical term), to work around some +ancient design feature. The end of the story is avoid implicit conversions. -Never rely on them it’s impossible to avoid them. +Never rely on them, it’s impossible to avoid them. Even if you declare everything explicit there is still a context in C++ where implicit conversion will be done. You should never rely on one type @@ -787,7 +681,7 @@

    Code

    Concepts as a language feature went through many iterations and delays before finally being included in the C++20 standard. When the course was given, a group at A9 (including Alex) was working to get them included in C++14. -You can read their proposal. +You can read their proposal. Bjarne actually visits A9 to give a guest lecture on concepts as part of the course, however this is not included in the notes as it is a departure from the rest of the material.
diff --git a/03_singleton.md b/03_singleton.md index 687da5c..290b2ba 100644 --- a/03_singleton.md +++ b/03_singleton.md @@ -4,7 +4,7 @@ ## Learning C++ isn't as hard as it appears I talked to someone and he said "I have been programming in C++ for about eight years, but I never learned C++". - That is a very common thing. +That is a very common thing. He was just honestly admitting something which happens to be the case for most people. Most of us never learn programming languages. I never learned C++ I was just very @@ -71,7 +71,7 @@ Why do I need template? Because we want to write something which takes one type another type. Let's forget C++ for a second. Mathematically speaking we want to write a type function: a thing which takes a type in -and generates new types. +and generates new types. The `template` is the mechanism in C++ for doing just that. There are of course other type functions, even type functions in C. @@ -135,7 +135,7 @@ There is no reason to write to it before you first deallocate. Another way to think about it is to help preserve an invariant. There is some condition we need to assure that the values cannot change. -For `singleton`, we pick struct because it's less typing. +For `singleton`, we pick `struct` because it's less typing. *We should find the most minimal way of saying what we want to say*. [^communist]: Alex: Don't be so eager with making members of classes private. @@ -264,7 +264,7 @@ people think invented C++ are idealized in books like "Effective C++", He tells you always declare destructor as virtual[^scott-virtual]. OK, he's wrong. Simple as that. -What we do we want to create? Take type `T`, put it in a struct. Will the size of the thing increase? +What do we want to create? Take type `T`, put it in a struct. Will the size of the thing increase? No. It has no overhead. Singleton is the same size as `T`. It's the wonderful thing about singleton, and lets you pack them in arrays. @@ -280,10 +280,10 @@ I am showing you how you could program differently. If you program in an object-oriented way then many good things might happen. I don't know what they are but you are not going to be efficient. Bjarne used to joke that object -oriented systems are those that slow graphics. +oriented systems are those that have slow graphics. As time progresses, `++` is getting faster and faster and virtual function is getting slower and slower. -Their spread is growing and we're not going to address +Their spread is growing and we're not going to address any parts of C++ in this course which slow the computations. If you want to learn about virtual I'm sure there are lots @@ -344,7 +344,8 @@ and goes with two other fundamental laws of thought [^laws-of-thought]: Computers do not obey such law. There is a fundamental case which breaks and has consequences. -**Exercise:** If you don't believe me, try to figure out a type which violates the law of identity. + +**Exercise:** If you don't believe me, try to figure out a type which violates the law of identity (solved in chapter 6). **The law of non-contradiction**: You cannot have a predicate `P` be true and `!P` be true at the same time. @@ -394,8 +395,7 @@ They were not copyable and you couldn't pass them as arguments. Then they fixed it. They said, "we'll just copy the bits" and things would work. - -Equality is defined for all built-in types; `int`, `double`, `short`, `char`, pointers, etc. +Equality is defined for all built-in types: `int`, `double`, `short`, `char`, pointers, etc. 
So, if you have a struct, why can't you generate an equality that uses those? Two things are equal if all their members are equal. It's a sensible rule a compiler could do. @@ -403,7 +403,7 @@ It's a sensible rule a compiler could do. I never managed to ask Dennis Ritchie about it. But, I asked probably the second best person on the subject, [Steve Johnson][johnson], because he actually implemented all these assignments for structures and things like that. -Steve told me, "it was very hard because the bits in the padding might not be equal. +Steve told me, "it was very hard because the bits in the padding might not be equal. If you compare bit by bit things which have equal members, but not equal bits in the padding, it will not work"[^padding]. But, why should you compare bit by bit? @@ -505,13 +505,12 @@ but expect nothing from them. Understand how all the other operators are defined in terms of `<`. Contemplate it. It's really important. -It is mathematics, but that's mathematics every programmer should be able to do . +It is mathematics, but that's mathematics every programmer should be able to do. [ip-quote]: https://en.wikipedia.org/wiki/Robustness_principle ### Specifying concepts - Now let's talk about what kind of type `T` could be. Because we implemented all these operators, `T` could be `SemiRegular`, `Regular`, or `TotallyOrdered`. @@ -528,7 +527,6 @@ In `singleton` we add a comment to describe this: It's a good example of a **disjunctive concept**. `T` could be any of them. - You might wonder how `==` will work, if you plug-in only a type `T` which is only `SemiRegular`. In C++, things don't have to be defined unless they are used. @@ -544,7 +542,7 @@ Same for total ordering, etc. [^concepts-proposal]: [Concepts][cpp-concepts] as a language feature went through many iterations and delays before finally being included in the C++20 standard. When the course was given, a group at A9 (including Alex) was working to get them included in C++14. - You can read their [proposal](papers/concepts_proposal.pdf). + You can read their [proposal](papers/concepts-proposal.pdf). Bjarne actually visits A9 to give a guest lecture on concepts as part of the course, however this is not included in the notes as it is a departure from the rest of the material. @@ -568,8 +566,7 @@ Why did these Bell Labs guys introduce such a thing? Were they mad? It was because they were lazy. They couldn't do it elegantly because they didn't -have -[function overloading][overloading]. +have [function overloading][overloading]. In C++ we have `sqrt(int)`, `sqrt(double)`, etc You can pass `int`, or `double` and it does the right thing. @@ -619,10 +616,10 @@ But, it did. `std::cin` is convertible to a pointer. If it fails, it returns `null` with type `void*`. -Since it is convertible to a pointer, you can apply one +Since it is convertible to a pointer, you can apply one more conversion and convert this pointer to a Boolean. Then you could convert it to an integer. -So `std::cin` becomes zero and you shifted by forty two positions. +So `std::cin` becomes zero and you shifted by forty-two positions. Isn't that beautiful (joke)? The problem is they fixed this problem by @@ -633,8 +630,9 @@ So they had to break the rule. Explicit conversions are not going to be called implicitly, unless they are in `while`, `if`, and other conditions which people used for input streams. So, the entire type system is screwed up (technical term), to work around some +ancient design feature. 
The end of the story is avoid implicit conversions. -Never rely on them it's impossible to avoid them. +Never rely on them, it's impossible to avoid them. Even if you declare everything explicit there is still a context in C++ where implicit conversion will be done. You should never rely on one type @@ -655,5 +653,3 @@ is false. Of course it handles implicit conversions, what else could it do? [overloading]: https://en.cppreference.com/w/cpp/language/overload_resolution [stat]: https://linux.die.net/man/2/stat [implicit-rules]: https://en.cppreference.com/w/cpp/language/implicit_conversion - - diff --git a/04_instrumented.html b/04_instrumented.html index 8cf022c..1e463e4 100644 --- a/04_instrumented.html +++ b/04_instrumented.html @@ -2,116 +2,9 @@ + 4. Instrumented: a performance measuring tool - + @@ -136,7 +29,7 @@

Great language designers

Here is a little speech about great language designers. Once upon a time there was a very great language designer called Niklaus Wirth. -He was very brilliant, no question about it, and a very wonderful, kind, charming, and witty, person. +He was very brilliant, no question about it, and a very wonderful, kind, charming and witty person. So he designed his first programming language called Euler which he did at Berkeley. Then it was ALGOL W. @@ -151,19 +44,19 @@

Great language designers

and design a better language. What about the customer base? Well, who cares. This is why we still have so many people programming in -Oberson. Because, by the time he got to Oberon, people got tired.

+Oberon. Because, by the time he got to Oberon, people got tired.

This is why I claim, that to be honest Bjarne is literally the greatest language designer, at least after John Backus. John Backus invented the first useful programming language Fortran. It is still a very useful programming language. Then came C and C++. -Dennis was brilliant. +Dennis was brilliant. He did C and then washed his hands and walked away, which is very wise. I sympathize. -But, Bjarne started working on C++ -in roughly 1978 (and then released in 1981), 35 years ago . +But Bjarne started working on C++ +in roughly 1978 (and then released in 1981), 35 years ago. Then he never abandoned us. It was never perfect, but he would work on it, and work on it, and work on it, and go to horrible meetings of the standard committee, @@ -175,28 +68,28 @@

Great language designers

mechanisms known to humankind. What you can do right now in C++, you cannot really do in any other language. -But, it requires patience, determination, +But it requires patience, determination and genius. Whatever decisions he made in 1979 didn’t lead to a stalemate later on. There is some ugly stuff, but you could avoid it. Being able to evolve the language for that long is incredible. I have no other example, not just in language design, but in computer science. -Ken Thompson did Unix but do you think he stayed +Ken Thompson did UNIX but do you think he stayed with UNIX? -No, in his tuning award speech he said he stopped working -on UNIX a long time. +No, in his Turing award speech he said he stopped working +on UNIX a long time ago. It’s very difficult.

I’m a clear example of a lazy bum. -STL was voted in August 1994, 20 years ago. +STL was voted in August 1994, 20 years ago. How many times did I attend standard committee meetings after that? None. -How many times did I look at proposals related to STL did I do anything related to this? +How many times did I look at proposals related to STL? Did I do anything related to this? Nothing. This is why I have sanity but also this is -why compared with Bjarne, I am a failure. +why, compared with Bjarne, I am a failure. I let people do things with STL that should have been prevented. I did not evolve it. I did not grow it. @@ -205,12 +98,13 @@

Great language designers

My advice to most of you, if you want good life follow my example. Because it’s very hard to do what Bjarne does. -I cannot point a single other example literally of a person who keeps +I cannot point to a single other example literally of a person who keeps working. McCarthy invents Lisp. After 1.5 he’s gone. He didn’t follow. -He didn’t contribute, and so on.

+He didn’t contribute, and so on. +Everybody does that, because we are weaklings.

So C++ is a great accomplishment, but of course it has flaws. @@ -261,7 +155,7 @@

Instrumented class

Redefining regular operations with counting

-

We’re going to write instrumented using the same technique +

We’re going to write instrumented using the same technique we use to write all classes:

    @@ -272,7 +166,7 @@

    Redefining regular operations with counting

    Now we will do some work to count operations. In the copy constructor, we will initialize value, -and add a line to +and add a line that bumps up the copy count, like this:

    instrumented(const instrumented& x) : value(x.value) {
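  ++counts[copy]; // the counter's enum name is our guess; the rest of this constructor is cut off by the hunk
}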
    @@ -298,7 +192,7 @@ 

    Redefining regular operations with counting

    This line is short, so I like one. I have been changing my programming style depending on the people with whom I work. -Paul McJones effected my programming style greatly when I started here. +Paul McJones affected my programming style greatly when I started here. For example I never used to use x. I avoided short variable names. In my old code everything is called special_variable_x. @@ -308,7 +202,7 @@

    Redefining regular operations with counting

    on singleton:

    ~instrumented() { ++counts[destructor]; }
    -instrumented& operator=(const instrumented& x) {  
    +instrumented& operator=(const instrumented& x) {
       ++counts[assignment];
       value = x.value;
       return *this;
    @@ -321,7 +215,7 @@ 

    Redefining regular operations with counting

    } // TotallyOrdered friend -bool operator<(const instrumented& x, const instrumented& y) { +bool operator<(const instrumented& x, const instrumented& y) { ++counts[comparison]; return x.value < y.value; } @@ -332,14 +226,14 @@

    Redefining regular operations with counting

    Storing counts

    -

    What do with all the counts? +

    What to do with all the counts? Where do they get stored? -Every time this instrumented thing happens we want some global count to be incremented. +Every time this instrumented thing happens we want some global count to be incremented. We were told that using global variables is bad. If I were doing it just for me, I would have used globals. Old guys don’t mind using global variables. They’re actually good. -Since you are modern people, we will show you how to do it to do with inheritance. +Since you are modern people, we will show you how to do it with inheritance. We will define a base class to hold this data:

    struct instrumented_base
    @@ -357,6 +251,10 @@ 

    Storing counts

    };
    +

    This is a remarkable example of a class containing nothing. +It is a very useful thing, we will use very many such classes. +It’s very cheap to pass things which contain nothing.

    +

    A static member is a member which is one per class, not one per instance, and they’re useful because we don’t want to keep count per instance. We want to keep count per class. The static members need to be initialized in a .cpp file. @@ -381,20 +279,18 @@

    Storing counts

    Inheritance is very useful when you inherit from a class containing nothing because it couldn’t do any harm. That’s what we’re going to do here.

    -
    struct instrumented : instrumented_base
    +
    template <typename T>
    +// T is Semiregular or Regular or TotallyOrdered
    +struct instrumented : instrumented_base
     
    -

    This is a remarkable example of a class containing nothing. -It is a very useful thing, we will use very many such classes. -It’s very cheap to pass things which contain nothing.

    -

    There is a notorious problem in C++ with static members of templates, it’s just not good. -We don’t need to inherit from a template all these different +We don’t need to inherit from a template. All these different instrumented<T>’s inherit from the same base which will contain nothing at all and we will use this as a counting device.

    -

    What is good about is we managed not to +

    What is good about this is we managed not to muck up this nice class. It’s basically the same as singleton. It’s fundamentally of the same structure and we pushed all of the @@ -403,7 +299,7 @@

    Storing counts

    How should we use enum?

    -

    enum which is a mechanism which introduces a bunch of constants. +

    enum is a mechanism which introduces a bunch of constants. It’s a very evil mechanism. I was wondering who invented enum, because it wasn’t in first edition of K&R1. @@ -416,8 +312,8 @@

    How should we use enum?

    Whether it worked there correctly or not remains to be seen.

    Dennis decided to bring it in, -but the issue is that it’s not really a type. - C++ attempts make it a type but doesn’t quite work. +but the issue is that it’s not really a type. +C++ attempts to make it a type but it doesn’t quite work. You could have a variable typed with the enum which has three different values and then you take totally different value assigned to it, nothing happens. @@ -434,18 +330,18 @@

    Use all the language features

    In general, I use any language feature when appropriate. Paul and I even use goto and we’re not ashamed. There is a famous statement attributed to Ken Thompson that the fastest way of going from one place in the program to another is by using the goto statement, and it is so. -If you implement things like called state machines it’s a +If you implement things like state machines it’s a wonderful technique, because you have transitions. You go from this state to that state. You could write a loop with some conditional. Or, you could just goto and write very beautiful code, at least we believe so. -Everything has its place, Dijkstra’s structures not withstanding2.

+Everything has its place, Dijkstra’s strictures notwithstanding2.

    Using instrumented to analyze sort

    -

    To learn how to use instrumented, let’s analyze the performance +

    To learn how to use instrumented, let’s analyze the performance of sorting routines in STL. There are a few of them and they all use a distinct algorithm:

    @@ -466,29 +362,36 @@

    Using instrumented to analyze sort

    To be stable you use merge sort.

  1. std::partial_sort

    -

    For example, you give it 100 elements and you sort from 1st to the - 10th to the last. But, for the last 90 elements there is no guarantee. +

    What’s the interface? + It takes three iterators first, middle, and last. + What does it do? + It orders the elements so that first to middle + contains the smallest elements from the entire range, in sorted order.

    + +

    For example, suppose you give it 100 elements and you want to sort from 1st, to the 10th, to the last. + The smallest 10 will be moved to the front in sorted order. + But the last 90 elements will be left in some unspecified order. Those of you who work on search, know you don’t really need to sort everything, - you just need to sort a little bit.

    + you just need to sort a little bit3.

    -

    What do you use for partial sort? +

    What algorithm do you use for partial sort? I’ll tell you that it’s wrong. - The solution which STL uses was good in 1994, but a bad solution in 2013. + The solution which STL uses was good in 1993, but a bad solution in 2013. It uses heap sort. - That’s what algorithm books tell you3.

  2. + That’s what algorithm books tell you and what I believed was the correct solution4.
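As a concrete sketch of the partial_sort interface described in the item above (example ours):

    #include <algorithm>
    #include <iostream>
    #include <vector>

    int main() {
      std::vector<int> v = {9, 1, 8, 2, 7, 3, 6, 4, 5, 0};
      // Move the smallest 3 elements, in sorted order, to the front;
      // the remaining 7 are left in an unspecified order.
      std::partial_sort(v.begin(), v.begin() + 3, v.end());
      for (int x : v) std::cout << x << ' ';
      std::cout << '\n'; // starts "0 1 2"; the tail varies by implementation
    }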

    We want to compare how these various sort operations perform, relative to each other.

    -

    Exercise: With instrumented, compare the number of operations -between these three kinds of sort4. +

    Exercise: With instrumented, compare the number of operations +between these three kinds of sort5. Refer to the code provided at the end of the chapter. A complete test harness is provided which will randomly shuffle a large list of numbers to test with and print the results in a formatted table. -Here is a sample of the output for heap sort5:

    +Here is a sample of the output for heap sort6:

           n        copy      assign    destruct     default       equal        less    construct
           16          69          91          69           0           0          65            0
    @@ -519,10 +422,10 @@ 

    Normalizing data

    Another useful way to study operation counts is by normalizing the data. We know the asymptotic complexity of sort algorithms should be O(n log(n)). -So, what we can do is normalize the data +So, what we can do is normalize the data to tell us for n elements, how many operations were done, per n log(n).

    -

    Here is an example of such a normalizing functions.

    +

    Here is an example of such a normalizing function:

    double normalized_by_nlogn(double x, double n) { 
       return x / (n * (log(n) / log(2))); 
    @@ -561,7 +464,7 @@ 

    Normalizing data

    16777216 0.19 1.06 0.19 0.00 0.00 1.01 0.00
    -

    You remember Knuth (Author of “The Art of Computer programming”)? +

    You remember Knuth (Author of “The Art of Computer Programming”)? In the beginning of the first volume when he introduces complexity he tells you how to measure complexity. He says we measure it as a function where we have @@ -580,7 +483,7 @@

    What data should we test on?

    Let us talk about possible input shapes. What is a good set of data to test these algorithms on? The most basic one is just to generate uniformly random data. -Another shape to try is a list which is already sort. +Another shape to try is a list which is already sorted. As we’ll discover later on, some sorting algorithms are particularly bad for this particular configuration. Both ascending and descending will give different results. @@ -592,7 +495,7 @@

    What data should we test on?

    to unequal elements.

    Random shuffle of uniform shuffle of random data is very good, -but it’s not a very realistic a distribution. +but it’s not a very realistic distribution. One which is very common in real life is called Zipf distribution. Let me describe it incorrectly, first. Assume that the most probable guy comes with probability 1. @@ -605,9 +508,9 @@

    What data should we test on?

    The denominator is actually ln(n) + gamma -where gamma is a small number6.

    +where gamma is a small number7.
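One way to generate such test data (a sketch following the 1/k description above; the function is ours):

    #include <algorithm>
    #include <cstddef>
    #include <random>
    #include <vector>

    // n samples from {0, ..., values - 1}, where value k is drawn with
    // probability proportional to 1 / (k + 1).
    std::vector<std::size_t> zipf_data(std::size_t n, std::size_t values) {
      std::vector<double> cumulative(values);
      double sum = 0.0;
      for (std::size_t k = 0; k < values; ++k) {
        sum += 1.0 / double(k + 1); // harmonic weights 1, 1/2, 1/3, ...
        cumulative[k] = sum;
      }
      std::mt19937 gen(42);
      std::uniform_real_distribution<double> uniform(0.0, sum);
      std::vector<std::size_t> result(n);
      for (std::size_t i = 0; i < n; ++i) {
        double r = uniform(gen);
        result[i] = std::size_t(std::lower_bound(cumulative.begin(),
                                                 cumulative.end(), r)
                                - cumulative.begin());
      }
      return result;
    }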

    -

    Exercise: Introduce variation into the shape of data and compare +

    Exercise: Introduce variation into the shape of data and compare the sorting algorithms again.

    @@ -615,7 +518,7 @@

    Measuring solution to unique elements

    Counting operations is only one measure of performance. If we apply instrumented to our problem of finding unique elements -in the first chapter, we will actually find that using +in the first chapter, we will find that using std::set actually uses fewer of almost every operation than first sorting with std::sort and then calling std::unique. @@ -686,11 +589,11 @@

    Code

  3. K&R (Kernighan and Ritchie) is a nickname for the book “The C Programming Language”. K&R usually specifically refers to the original release. -A layer edition was made when the C language became ANSI standardized, +A later edition was made when the C language became ANSI standardized, and the cover of that edition is labeled as such.
  4. Goto used to be the primary way to do control flow in programs, -because it closely resembles how how machines and their languages work.

    +because it closely resembles how machines and their languages work.

    For example, to implement a while loop, you might write:

    @@ -703,7 +606,7 @@

    Code

    It doesn’t look bad there, but if you do a lot of control flow (especially using adhoc patterns, besides while) -then it becomes “spagehetti code” +then it becomes “spaghetti code” that is difficult to read and follow. In a complex program, one must essentially read every statement as if you were the computer and jump @@ -712,7 +615,7 @@

    Code

    Dijkstra heavily criticized this approach in his famous paper: “Go to statement considered harmful”.

    -

    Alex is observing that it is a good solution to many problems, +

    Alex is observing that it is a good solution to many problems, especially when used in a restricted context, and not as the primary way to organize programs. Later on he will give examples.

    @@ -721,15 +624,19 @@

    Code

    which allows one to use labels and goto restricted to a specific block. Fast, assembly-like, messy code can be wrapped in nice functional interfaces.

  5. +Alex: If you can sort a little bit, you can sort everything. +Set the second argument to be the same as the third: +std::partial_sort(first, last, last).
  6. +
  7. Alex: There is a perfectly wonderful three line solution which could change partial sort and make it much more acceptable to modern computers but it’s not the standard. It will take the implementers of the standard library -another 15 years to catch up.
  8. -
  9. +another 15 years to catch up.
  10. +
  11. Many programmers imagine the C++ standard library is a package like sqlite or LaTeX that is centrally developed and deployed to many platforms. - This is not the case. +This is not the case. Vendors who want to create a C++ compiler and support it on their platform typically develop their own library implementation in agreement with the standard. There is little or no collaboration on library code between platforms.

    @@ -741,14 +648,14 @@

    Code

    that is the same algorithm wherever I go. Obviously somebody modified them a little bit over these 20 years. -For example, people at Apple did something slightly different from people at GNU.

  12. -
  13. -AMD Ryzen 5 2400G (8 core, 3.6 GHz). GCC 9.3.0
  14. +For example, people at Apple did something slightly different from people at GNU.

  15. +AMD Ryzen 5 2400G (8 core, 3.6 GHz). GCC 9.3.0
  16. +
  17. Alex states that the little bit added on to log(n) is called the Stirling number. This does not appear to be correct, and he probably meant -to refer to the Euler-Masheroni constant.
18. +to refer to the Euler-Mascheroni constant.
diff --git a/04_instrumented.md b/04_instrumented.md index 43fdd63..53d3021 100644 --- a/04_instrumented.md +++ b/04_instrumented.md @@ -6,7 +6,7 @@ Here is a little speech about great language designers. Once upon a time there was a very great language designer called [Niklaus Wirth][wirth]. -He was very brilliant, no question about it, and a very wonderful, kind, charming, and witty, person. +He was very brilliant, no question about it, and a very wonderful, kind, charming and witty person. So he designed his first programming language called [Euler][euler] which he did at [Berkeley][berkeley]. Then it was [ALGOL W][algolw]. @@ -21,19 +21,19 @@ but, he would design a beautiful language, observe that it's limited, throw it a and design a better language. What about the customer base? Well, who cares. This is why we still have so many people programming in -Oberson. Because, by the time he got to Oberon, people got tired. +Oberon. Because, by the time he got to Oberon, people got tired. This is why I claim, that to be honest Bjarne is literally the greatest language designer, at least after [John Backus][backus]. John Backus invented the first useful programming language [Fortran][fortran]. It is still a very useful programming language. Then came C and C++. -Dennis was brilliant. +Dennis was brilliant. He did C and then washed his hands and walked away, which is very wise. I sympathize. -But, Bjarne started working on C++ -in roughly 1978 (and then released in 1981), 35 years ago . +But Bjarne started working on C++ +in roughly 1978 (and then released in 1981), 35 years ago. Then he never abandoned us. It was never perfect, but he would work on it, and work on it, and work on it, and go to horrible meetings of the standard committee, @@ -45,28 +45,28 @@ further, and further, and further, with the most advanced language mechanisms known to humankind. What you can do right now in C++, you cannot really do in any other language. -But, it requires patience, determination, +But it requires patience, determination and genius. Whatever decisions he made in 1979 didn't lead to a stalemate later on. There is some ugly stuff, but you could avoid it. Being able to evolve the language for that long is incredible. I have no other example, not just in language design, but in computer science. -Ken Thompson did Unix but do you think he stayed +Ken Thompson did UNIX but do you think he stayed with UNIX? -No, in his tuning award speech he said he stopped working -on UNIX a long time. +No, in his Turing award speech he said he stopped working +on UNIX a long time ago. It's very difficult. I'm a clear example of a lazy bum. -STL was voted in August 1994, 20 years ago. +STL was voted in August 1994, 20 years ago. How many times did I attend standard committee meetings after that? None. -How many times did I look at proposals related to STL did I do anything related to this? +How many times did I look at proposals related to STL? Did I do anything related to this? Nothing. This is why I have sanity but also this is -why compared with Bjarne, I am a failure. +why, compared with Bjarne, I am a failure. I let people do things with STL that should have been prevented. I did not evolve it. I did not grow it. @@ -75,12 +75,13 @@ I know it's a free country, you can. My advice to most of you, if you want good life follow my example. Because it's very hard to do what Bjarne does. 
-I cannot point a single other example literally of a person who keeps +I cannot point to a single other example literally of a person who keeps working. [McCarthy][mccarthy] invents Lisp. After [1.5][lisp15] he's gone. He didn't follow. He didn't contribute, and so on. +Everybody does that, because we are weaklings. So C++ is a great accomplishment, but of course it has flaws. @@ -141,7 +142,7 @@ Writing this particular class will teach you once and for all to write ### Redefining regular operations with counting -We're going to write instrumented using the same technique +We're going to write `instrumented` using the same technique we use to write all classes: 1. Copy and paste the [`singleton.h`](code/singleton.h) file we made last time. @@ -149,7 +150,7 @@ we use to write all classes: Now we will do some work to count operations. In the copy constructor, we will initialize value, -and add a line to +and add a line that bumps up the copy count, like this: instrumented(const instrumented& x) : value(x.value) { @@ -170,7 +171,7 @@ When I write code I want to do two things: This line is short, so I like one. I have been changing my programming style depending on the people with whom I work. -[Paul McJones][paul] effected my programming style greatly when I started here. +[Paul McJones][paul] affected my programming style greatly when I started here. For example I never used to use `x`. I avoided short variable names. In my old code everything is called `special_variable_x`. @@ -180,7 +181,7 @@ Continue making similar replacements for the rest of the operations on singleton: ~instrumented() { ++counts[destructor]; } - instrumented& operator=(const instrumented& x) { + instrumented& operator=(const instrumented& x) { ++counts[assignment]; value = x.value; return *this; @@ -193,7 +194,7 @@ on singleton: } // TotallyOrdered friend - bool operator<(const instrumented& x, const instrumented& y) { + bool operator<(const instrumented& x, const instrumented& y) { ++counts[comparison]; return x.value < y.value; } @@ -205,14 +206,14 @@ on singleton: ### Storing counts -What do with all the counts? +What to do with all the counts? Where do they get stored? -Every time this `instrumented` thing happens we want some global count to be incremented. +Every time this `instrumented` thing happens we want some global count to be incremented. We were told that using global variables is bad. If I were doing it just for me, I would have used globals. Old guys don't mind using global variables. They're actually good. -Since you are modern people, we will show you how to do it to do with inheritance. +Since you are modern people, we will show you how to do it with inheritance. We will define a base class to hold this data: struct instrumented_base @@ -229,7 +230,9 @@ We will define a base class to hold this data: static void initialize(size_t); }; - +This is a remarkable example of a class containing nothing. +It is a very useful thing, we will use very many such classes. +It's very cheap to pass things which contain nothing. A static member is a member which is one per class, not one per instance, and they're useful because we don't want to keep count per instance. We want to keep count per class. @@ -254,19 +257,17 @@ This is false. Inheritance is very useful when you inherit from a class containing nothing because it couldn't do any harm. That's what we're going to do here. 
+ template + // T is Semiregular or Regular or TotallyOrdered struct instrumented : instrumented_base -This is a remarkable example of a class containing nothing. -It is a very useful thing, we will use very many such classes. -It's very cheap to pass things which contain nothing. - There is a notorious problem in C++ with static members of templates, it's just not good. -We don't need to inherit from a template all these different +We don't need to inherit from a template. All these different `instrumented`'s inherit from the same base which will contain nothing at all and we will use this as a counting device. -What is good about is we managed not to +What is good about this is we managed not to muck up this nice class. It's basically the same as singleton. It's fundamentally of the same structure and we pushed all of the @@ -274,7 +275,7 @@ statistic collection stuff out into a helper class. ### How should we use enum? -`enum` which is a mechanism which introduces a bunch of constants. +`enum` is a mechanism which introduces a bunch of constants. It's a very evil mechanism. I was wondering who invented enum, because it wasn't in first edition of K&R[^kandr]. @@ -287,8 +288,8 @@ the person who invented it was weird. Whether it worked there correctly or not remains to be seen. Dennis decided to bring it in, -but the issue is that it's not really a type. - C++ attempts make it a type but doesn't quite work. +but the issue is that it's not really a type. +C++ attempts to make it a type but it doesn't quite work. You could have a variable typed with the enum which has three different values and then you take totally different value assigned to it, nothing happens. @@ -299,11 +300,9 @@ But, do not depend on any operations. Never depend on a value of a given enum. - - [^kandr]: K&R (Kernighan and Ritchie) is a nickname for the book ["The C Programming Language"][c-lang]. K&R usually specifically refers to the original release. - A layer edition was made when the C language became ANSI standardized, + A later edition was made when the C language became ANSI standardized, and the cover of that edition is labeled as such. [c-lang]: https://en.wikipedia.org/wiki/The_C_Programming_Language @@ -315,17 +314,17 @@ I use inheritance, when appropriate. In general, I *use any language feature when appropriate*. Paul and I even use `goto` and we're not ashamed. There is a famous statement attributed to Ken Thompson that the fastest way of going from one place in the program to another is by using the `goto` statement, and it is so. -If you implement things like called state machines it's a +If you implement things like state machines it's a wonderful technique, because you have transitions. You go from this state to that state. You could write a loop with some conditional. Or, you could just `goto` and write very beautiful code, at least we believe so. -Everything has its place, Dijkstra's structures not withstanding[^goto]. +Everything has its place, Dijkstra's strictures not withstanding[^goto]. [^goto]: Goto used to be the primary way to do control flow in programs, - because it closely resembles how how machines and their languages work. + because it closely resembles how machines and their languages work. For example, to implement a `while` loop, you might write: @@ -337,7 +336,7 @@ Everything has its place, Dijkstra's structures not withstanding[^goto]. 
It doesn't look bad there, but if you do a lot of control flow (especially using adhoc patterns, besides `while`) - then it becomes "spagehetti code" + then it becomes "spaghetti code" that is difficult to read and follow. In a complex program, one must essentially read every statement as if you were the computer and jump @@ -346,7 +345,7 @@ Everything has its place, Dijkstra's structures not withstanding[^goto]. Dijkstra heavily criticized this approach in his famous paper: ["Go to statement considered harmful"][goto-paper]. - Alex is observing that it is a good solution to many problems, + Alex is observing that it is a good solution to many problems, especially when used in a restricted context, and not as the primary way to organize programs. Later on he will give examples. @@ -361,7 +360,7 @@ Everything has its place, Dijkstra's structures not withstanding[^goto]. ## Using instrumented to analyze sort -To learn how to use instrumented, let's analyze the performance +To learn how to use `instrumented`, let's analyze the performance of sorting routines in STL. There are a few of them and they all use a distinct algorithm: @@ -383,16 +382,23 @@ and they all use a distinct algorithm: - [`std::partial_sort`](https://en.cppreference.com/w/cpp/algorithm/partial_sort) - For example, you give it 100 elements and you sort from 1st to the - 10th to the last. But, for the last 90 elements there is no guarantee. - Those of you who work on search, know you don't really need to sort everything, - you just need to sort a little bit. + What's the interface? + It takes three iterators `first`, `middle`, and `last`. + What does it do? + It orders the elements so that `first` to `middle` + contains the smallest elements from the entire range, in sorted order. - What do you use for partial sort? + For example, suppose you give it 100 elements and you want to sort from 1st, to the 10th, to the last. + The smallest 10 will be moved to the front in sorted order. + But the last 90 elements will be left in *some unspecified order*. + Those of you who work on search, know you don't really need to sort everything, + you just need to sort a little bit[^partial-complete-sort]. + + What algorithm do you use for partial sort? I'll tell you that it's wrong. - The solution which STL uses was good in 1994, but a bad solution in 2013. + The solution which STL uses was good in 1993, but a bad solution in 2013. It uses [heap sort](https://en.wikipedia.org/wiki/Heapsort). - That's what algorithm books tell you[^heap-sort-correction]. + That's what algorithm books tell you and what I believed was the correct solution[^heap-sort-correction]. We want to compare how these various sort operations perform, @@ -403,9 +409,15 @@ relative to each other. It will take the implementers of the standard library another 15 years to catch up. +[^old-heap-algorithm]: Alex: Computers changed. + The algorithms which worked perfectly wonderfully in 1993, still work in the books. + But they don't work in the computer. +[^partial-complete-sort]: Alex: If you can sort a little bit, you can sort everything. + Set the second argument to be the same as the third: + `std::partial_sort(first, last, last)`. -**Exercise**: With instrumented, compare the number of operations +**Exercise**: With `instrumented`, compare the number of operations between these three kinds of sort[^implementations]. Refer to the code provided at the end of the chapter. 
A complete test harness is provided which will randomly @@ -439,7 +451,7 @@ Here is a sample of the output for heap sort[^pc-info]: [^implementations]: Many programmers imagine the C++ standard library is a package like `sqlite` or `LaTeX` that is centrally developed and deployed to many platforms. - This is not the case. + This is not the case. Vendors who want to create a C++ compiler and support it on their platform typically develop their own library implementation in agreement with the standard. There is little or no collaboration on library code between platforms. @@ -460,10 +472,10 @@ Here is a sample of the output for heap sort[^pc-info]: Another useful way to study operation counts is by *normalizing the data*. We know the asymptotic complexity of sort algorithms should be `O(n log(n))`. -So, what we can do is normalize the data +So, what we can do is normalize the data to tell us for `n` elements, how many operations were done, per `n log(n)`. -Here is an example of such a normalizing functions. +Here is an example of such a normalizing function: double normalized_by_nlogn(double x, double n) { return x / (n * (log(n) / log(2))); @@ -500,7 +512,7 @@ Here is a sample of data for heap sort with measurements normalized: 8388608 0.20 1.06 0.20 0.00 0.00 1.02 0.00 16777216 0.19 1.06 0.19 0.00 0.00 1.01 0.00 -You remember Knuth (*Author of "The Art of Computer programming"*)? +You remember Knuth (Author of "The Art of Computer Programming")? In the beginning of the first volume when he introduces complexity he tells you how to measure complexity. He says we measure it as a function where we have @@ -519,7 +531,7 @@ we could do more operations without actually incurring more time. Let us talk about possible input shapes. What is a good set of data to test these algorithms on? The most basic one is just to generate uniformly random data. -Another shape to try is a list which is already sort. +Another shape to try is a list which is already sorted. As we'll discover later on, some sorting algorithms are particularly bad for this particular configuration. Both ascending and descending will give different results. @@ -531,7 +543,7 @@ But, eventually we want to define some measure of the ratio of equal to unequal elements. Random shuffle of uniform shuffle of random data is very good, -but it's not a very realistic a distribution. +but it's not a very realistic distribution. One which is very common in real life is called [Zipf distribution][zipf]. Let me describe it incorrectly, first. Assume that the most probable guy comes with probability `1`. @@ -545,7 +557,7 @@ normalize by the sum of the harmonic series up to `n`: The denominator is actually `ln(n) + gamma` where gamma is a small number[^stirling]. -**Exercise:** Introduce variation into the shape of data and compare +**Exercise:** Introduce variation into the shape of data and compare the sorting algorithms again. @@ -563,7 +575,7 @@ the sorting algorithms again. Counting operations is only one measure of performance. If we apply `instrumented` to our problem of finding unique elements -in the first chapter, we will actually find that using +in the first chapter, we will find that using `std::set` actually uses fewer of almost every operation than first sorting with `std::sort` and then calling `std::unique`. diff --git a/05_swap.html b/05_swap.html index 40f661d..df94c58 100644 --- a/05_swap.html +++ b/05_swap.html @@ -2,116 +2,9 @@ + 5. Swap: a fundamental component - + @@ -166,7 +59,7 @@

What is a component?

It’s something which is not specific and could be used by all the applications which need this particular problem solved. Then comes another important question. -People comes to me and say “why don’t we use Go, or Scala, +People come to me and say “why don’t we use Go, or Scala, and many others?” Let us discuss what components are in terms of a programming language. @@ -201,7 +94,7 @@

Relative and absolute efficiency

-

These are two fundamental and different kinds of efficiency.

+

These are two fundamental and different kinds of efficiency1.

People say, “Alex you use C++ because you sold out to dark forces”. If I sold out to dark forces, I wouldn’t be working at my age. @@ -210,7 +103,7 @@

Relative and absolute efficiency

I still program in C++ because as far as I could ascertain it’s the only language which allows me generality and absolute efficiency. I can program as general as I like. -I can talk about things like monoids and semi-groups. +I can talk about things like monoids and semi-groups2. When it compiles I could look at assembly code and see it is good. It is absolutely efficient.

@@ -229,13 +122,13 @@

Three tests of a language’s ability to write components

of whether a language is good enough. I still use it to determine whether a language is suitable for what I want to do or not. There are three programs which I need to -implement in a general way to know that the language is suitable these three +implement in a general way to know that the language is suitable. These three programs are:

  1. swap: takes two things and swaps them.
2. min: takes two things and figures out which one is smaller.
  3. -
  4. linear search. goes through a bunch of stuff and finds the one you want.
  5. +
  6. linear search: goes through a bunch of stuff and finds the one you want.
@@ -247,45 +140,44 @@

Three tests of a language’s ability to write components

But people always think that exciting things have to be complicated. I claim exciting things tend to be very simple and basic.

-

So you say, “Alex, why don’t we use a new language?” +

So you say, “Alex, why don’t we use a new language?” Go try implementing these three programs in your favorite language. Do them in a general way. -If they’re at least relatively efficient, that is, they are not slower than specific things -written in the language, then let us talk. +If they’re at least relatively efficient, that is, they are not slower than specific things written in the language, then let us talk. If you cannot do it, let us stick with C++. -I’m just explaining the reasoning behind my my choice of C++.

+I’m just explaining the reasoning behind my choice of C++.

Swap

Let us look at these three programs. -Why they are important? +Why are they important? Why is swap important? What does it deal with? Apparently it’s not self-evident. -Once upon a time I was talking to a very famous program, +Once upon a time I was talking to a very famous programmer, supposedly the best programmer A9 ever had. -I told him about these three things and he looks at me +I told him about these three things and he looked at me and said, “I never had to use swap in my life”. I don’t know… I was very impressed because you swap for sorting, for reversing the sequence, for rotating the sequence, -for all kind of operations. -Basically, if you do something with a -sequence you swap. - so it is very important practically. -But, it also happens to be very important -theoretically because a long time ago when people were starting group theory1 -They discovered that any permutation of a sequence could be generated out of swap2. +for all kinds of operations. +Basically if you do something with a +sequence, you swap. +So it is very important practically. +But it also happens to be very important +theoretically, because a long time ago when people were starting group theory +they discovered that any permutation of a sequence could be generated out of swap3. Swap is the most primitive operation. -the reason is sequence, and any other -permutation, can be constructed out of swap. +The reason is that any rearrangement of a sequence, that is, any +permutation, can be constructed out of swap.

+The reason is sequence. And any other +permutation can be constructed out of swap.

-

But, apparently not everyone, even famous programmers, +

But apparently not everyone (even famous programmers) realized that. Well, he had to claim that the language he thought was the greatest language was great, and since it couldn’t do swap, -what do you? +what do you do? You deny the utility of swap.

@@ -313,7 +205,7 @@

General swap

Specialized swap

For some types, this swap will perform poorly. -Could you give an example of it is horribly inefficient? +Could you give an example of it being horribly inefficient? What about a large container? Consider:
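    // (sketch; the hunk cuts the example off)
    std::vector<int> a(1000000, 7);
    std::vector<int> b(1000000, 8);
    swap(a, b); // generic swap: one copy construction and two assignments,
                // each touching a million elements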

@@ -324,14 +216,14 @@

Specialized swap

It will construct a temporary vector and copy every single element into that temporary, and then back -(several million operations)3.

+(several million operations)4.

So we have generic code which works everywhere, except it’s very slow. What should we do if someone says, “I have a wonderful generic solution, very abstract, but it takes a million iterations when there should be three.”? Throw him out. There is no excuse. -Then he says, “oh, but I could use tropical semirings”, -take tropical semirings and do something to him and them4.

+Then he says, “Oh, but I could use tropical semirings”. +Take tropical semirings and do something to him and them5.

If you think about the algorithm that needs to take place, it requires knowledge of how vector actually stores data. @@ -340,7 +232,7 @@

Specialized swap

A central feature of a container is ownership of the elements. So the elements and container go together. -For things of this kind, we nee to write a special swap.

+For things of this kind, we need to write a special swap.

template<typename T>
 // T is Semi-Regular
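    void swap(std::vector<T>& x, std::vector<T>& y) {
      // body elided by this hunk; presumably it exchanges the vector
      // headers rather than the elements. std::vector's member swap has
      // that effect: a constant-time exchange of the internal pointers.
      x.swap(y);
    }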
@@ -352,7 +244,7 @@ 

Specialized swap

It would be wonderful to be able to just type those comments and the -and compiler will do it for us sadly enough we’re not there yet.

+compiler will do it for us. Sadly enough we’re not there yet.

When we write a special version of this function, it is called partial template specialization. @@ -364,7 +256,7 @@

XOR swap

What if we don’t have an extra memory location? Could we write swap? -Yes, there is a beautiful algorithm using XOR5.

+Yes, there is a beautiful algorithm using XOR6.

template<typename T>
 // T is UnsignedIntegral
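    void swap_xor(T& x, T& y) {
      // body elided by this hunk; the classic three assignments are:
      x = x ^ y;
      y = x ^ y; // y now holds the original x
      x = x ^ y; // x now holds the original y
    }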
@@ -377,26 +269,26 @@ 

XOR swap

Exercise: Prove swap_xor is correct. To do this you will need to discover - some basic properties of ^. (See solution6.)

+ some basic properties of ^. (See solution7.)

What are the requirements for this algorithm? -Specifically, what types have an XOR operator ^. +Specifically, what types have an XOR operator ^? Could we use it on int? Yes, but it’s a bad idea. The language standard says that the -result of XOR for the signed bit is not defined. +result of XOR for the sign bit is not defined. If it is a positive integer you know what is going on for the sign bits. When it’s negative you have no idea.

So use it for unsigned int, or char. Basically, unsigned integral types. -So, it’s not particularly useful.

+So it’s not particularly useful.

-

But, there is a case where it doesn’t work, +

But there is a case where it doesn’t work, which is weird because we have a proof it does work (if you did the exercise above). -But, in our proof we made the +In our proof we made the small assumption that x and y are different objects. Because if they happen to be the same object, the value it contains at -the end of this function will be always be zero. +the end of this function will always be zero. Every bit will be zapped completely totally and absolutely.

We could fix this by wrapping the body in:
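    if (&x != &y) { // skip the xor dance when x and y are the same object
      // ... the three assignments from swap_xor ...
    }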

@@ -408,7 +300,7 @@

XOR swap

Is it a good idea? In this case, it’s important for correctness. -But, be careful. +But be careful. We should never add it to the other swap because it adds more work to the common case.

@@ -422,7 +314,7 @@

XOR swap

}
-

But, if you optimize for 5, and you have more than three integers, +

But if you optimize for 5, and you have more than three integers, it will seldom be 5 and you will be doing the check all the time.

So two rules:

@@ -439,28 +331,28 @@

Is the inline keyword important?

inline is one of the things which will go away. There are certain things in C and C++ which are there because compiler technology was imperfect. When I started in C++ in 1986 -I had to write the keyword register everywhere because believe it or -not compilers wouldn’t use registers7 unless you specifically indicated that something goes into the register. +I had to write the keyword register everywhere because, believe it or +not, compilers wouldn’t use registers8 unless you specifically indicated that something goes into the register. Of course if it went into the register you could never use address operator & because obviously registers do not have addresses. It was a very special thing you needed to worry about. It was important in my measurements at the time. -Stripping register declarations from fundamental algorithms caused a slow down by a factor of three. +Stripping register declarations from fundamental algorithms caused a slowdown by a factor of three. Why? For every call to ++ the assembly code first did a load and then after it stored the result. -At that time computers used to do one thing at the time. +At that time computers used to do one thing at a time. So by adding a load and store around everything, it basically tripled the time.

This is no longer true, meaning that computers no longer execute one operation at a time, as we will discover. For sure, you never need to worry about registers. -In modern computers, this is utterly idiotic you should never do it. -In the same way the compiler is perfectly theoretically capable +In modern computers this is utterly idiotic; you should never do it. +In the same way the compiler is perfectly theoretically capable of figuring out what needs to be inline, much more than you.

But, we’re living in -this transition time. +this transition time. I think about five years from now you will never need to write inline. Compilers will do it. @@ -468,7 +360,7 @@

Is the inline keyword important?

You remove this inline and you could get an enormous performance difference. It could be a factor of 10 for something like swap. -The problem is that the function call sequences, +The problem is that the function call sequence is a bad sequence.

@@ -482,16 +374,46 @@

Code


  1. -

    Group theory -is one of the main subjects of abstract algebra. -The key idea is that many mathematical structures behave similarly. -You can add and subtract numbers, you can add and subtract vectors and matrices. -Can we study all structures which can add and subtract all together? -It turns out you can, and one such structure is a group.

    - -

    Alex’s ideas about generic programming are inspired -by abstract algebra.

  2. +

    Bjarne: +“[Alex] defined the abstraction penalty as the ratio of runtime between a templated operation +(say, find on a vector<int>) and the trivial nontemplated equivalent (say a loop over an array of int). +An implementation that does all of the easy and obvious optimizations gets a ratio of 1. +Poor compilers had an abstraction penalty of 3, though even then good implementations did significantly better. +In October 1995, to encourage implementers to do better, +Alex wrote the “abstraction penalty benchmark”, which simply measured the abstraction penalty. +Compiler and optimizer writers didn’t like their implementations to be obviously poor, +so today ratios of 1.02 or so are common.” (“Evolving a language in and for the real world: C++ 1991-2006”)

  3. +

    Groups, monoids, and rings are a few of the subjects of abstract algebra, +a field which studies the fundamental properties of mathematical structures. +The key idea is that many different mathematical objects appear to function similarly. +Vectors and matrices can be “added” and “subtracted” just like integers. +In what ways are they fundamentally the same? +One explanation is that all of them form a group. +Below is a formal definition:

    + +

    A group is a set G with a binary operation * : G x G -> G such that:

    + +
      +
    1. G contains an identity element e in G such that e * x = x * e = x for all x in G.
    2. +
    3. The operation * is associative. So ((x * y) * z) = (x * (y * z)) for all x, y, z in G.
    4. +
    5. Every element x in G has an inverse element y such that x * y = y * x = e.
    6. +
    + + +

    For example integers are a group with the operation of addition and the identity element 0.

    + +
      +
    1. 0 + x = x + 0 = x
    2. +
    3. ((x + y) + z) = (x + (y + z)).
    4. +
    5. x + (-x) = (-x) + x = 0.
    6. +
    + + +

The process of discovering and applying generic concepts is very similar. +Alex introduces the basics of abstract algebra, from a programmer’s perspective, +in his book “From Mathematics to Generic Programming”.

  4. +
  5. A permutation is a bijection (1-1, onto) map from a set to itself. For example we can define the following permutation on the first @@ -513,8 +435,8 @@

    Code

    where g(1) = 3 and g(3) = 1 and h(1) = 2 and h(2) = 1. -
  6. -
  7. +
  8. +
  9. Since C++11 this issue has been addressed by move semantics and even the swap as we have written it may perform well.

    @@ -532,16 +454,15 @@

    Code

    On the other hand, it solves this problem generally, so that one does not need to write custom swaps for every data structure. -It shifts the responsbility to the data structure author, -instead of every algorithm which might use it.

  10. -
  11. +It shifts the responsibility to the data structure author, +instead of every algorithm which might use it.

  12. +
  13. Alex himself uses Tropical semi-rings to describe several algorithms in his book “From Mathematics to Generic Programming” (See chapter 8.6). -So his issue here is not algebraic abstractions, but pursing abstraction -with enormous cost.
  14. -
  15. -

    The ^ symbol is bitwise exclusive or. -The statement a XOR b means a is true, or b is true, but not both. +So his issue here is not abstraction itself, rather that it can become too costly.

  16. +
  17. +

    The ^ symbol is bitwise exclusive or. +The expression a ^ b means a is true, or b is true, but not both. It is defined by the following truth table:

    a | b | a XOR b
    @@ -550,8 +471,8 @@ 

    Code

    1 0 1
    0 1 1
    1 1 0 -
  18. -
  19. +
  20. +
  21. We need to use the fact that XOR is associative and commutative.

    Proof:

    @@ -566,19 +487,19 @@

    Code

    = b ^ (a ^ a) = b ^ 0 = b -
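    Assembled into code (a sketch; the name xor_swap and the aliasing guard are ours, and the chapter discusses both the unsigned requirement and the self-swap problem):

        template <typename T>
        // T is UnsignedIntegral
        void xor_swap(T& x, T& y) {
          if (&x == &y) return; // guard: a self-swap would zero the value
          x = x ^ y;
          y = x ^ y; // y now holds the original x
          x = x ^ y; // x now holds the original y
        }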
  22. -
  23. +
  24. +
  25. In CPU architecture, a register is a slot on the CPU -that can store a single value (typically 32 bit or 64 bits). +that can store a single value (typically 32 or 64 bits). Most CPU operations are confined to operating on values in registers. For example, an “add” instruction might add the value in one register, -to another register, and then stores the value in a third register. +to another register, and then store the value in a third register. Separate “load” and “store” instructions are used to move a value between a register and a location in memory.

    -

    Typically a CPU has only a handful (fewer than 50) of register, so a large part of the program +

    Typically a CPU has only a handful of registers (fewer than 50), so a large part of the program is spent moving values from memory into registers so they can be operated on, -and then transferring the results back to main memory.
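    For illustration only (our sketch, not from the lecture), pre-standard code was littered with the hint:

        // 1980s-style C++: keep the accumulator in a register.
        int sum(const int* first, const int* last) {
          register int r = 0;    // hint: no load/store around each use
          while (first != last) r = r + *first++;
          return r;              // taking &r would defeat the hint
        }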

  26. +and then transferring the results back to main memory.

diff --git a/05_swap.md b/05_swap.md index ed36663..a77467f 100644 --- a/05_swap.md +++ b/05_swap.md @@ -42,7 +42,7 @@ A component is something which solves a problem in a general way. It's something which is not specific and could be used by all the applications which need this particular problem solved. Then comes another important question. -People comes to me and say "why don't we use [Go][go], or [Scala][scala], +People come to me and say "why don't we use [Go][go], or [Scala][scala], and many others?" Let us discuss what components are in terms of a programming language. @@ -71,7 +71,7 @@ as a non-generic non-component written in the same language. which could be done on a given machine. Basically you know it's as fast as assembly language. -These are two fundamental and different kinds of efficiency. +These are two fundamental and different kinds of efficiency[^absolute-efficiency-history]. People say, "Alex you use C++ because you sold out to dark forces". If I sold out to dark forces, I wouldn't be working at my age. @@ -80,10 +80,21 @@ After all, I didn't start with C++. I still program in C++ because as far as I could ascertain it's the only language which allows me generality and absolute efficiency. I can program as general as I like. -I can talk about things like [monoids][monoid] and [semi-groups][semi-group]. +I can talk about things like [monoids][monoid] and [semi-groups][semi-group][^about-group-theory]. When it compiles I could look at assembly code and see it is good. It is absolutely efficient. +[^absolute-efficiency-history]: + Bjarne: + "[Alex] defined **the abstraction penalty** as the ratio of runtime between a templated operation + (say, find on a `vector`) and the trivial nontemplated equivalent (say a loop over an array of `int`). + An implementation that does all of the easy and obvious optimizations gets a ratio of 1. + Poor compilers had an abstraction penalty of 3, though even then good implementations did significantly better. + In October 1995, to encourage implementers to do better, + Alex wrote the “abstraction penalty benchmark”, which simply measured the abstraction penalty. + Compiler and optimizer writers didn’t like their implementations to be obviously poor, + so today ratios of 1.02 or so are common." (["Evolving a language in and for the real world: C++ 1991-2006"](papers/evolving-a-language.pdf)) + [go]: https://en.wikipedia.org/wiki/Go_(programming_language) [scala]: https://en.wikipedia.org/wiki/Scala_(programming_language) [turing]: https://en.wikipedia.org/wiki/Turing_completeness @@ -104,12 +115,12 @@ A long time ago I came up with a very simple test of whether a language is good enough. I still use it to determine whether a language is suitable for what I want to do or not. There are three programs which I need to -implement in a general way to know that the language is suitable these three +implement in a general way to know that the language is suitable. These three programs are: 1. `swap`: takes two things and swaps them. 2. `min`: takes two things and figure out which one is smaller. -2. `linear search`. goes through a bunch of stuff and finds the one you want. +3. `linear search`: goes through a bunch of stuff and finds the one you want. Aren't these too simple? If we cannot do simple things, it is very unlikely we will be able to do hard things. @@ -119,56 +130,70 @@ I say, look I'm not interested, because I want to see solutions to simple proble But people always think that exciting things have to be complicated. 
I claim exciting things tend to be very simple and basic. -So you say, "Alex, why don't we use a new language?" +So you say, "Alex, why don't we use a new language?" Go try implementing these three program in your favorite language. Do them in a general way. -If they're at least relatively efficient, that is, they are not slower than specific things -written in the language, then let us talk. +If they're at least relatively efficient, that is, they are not slower than specific things written in the language, then let us talk. If you cannot do it, let us stick with C++. -I'm just explaining the reasoning behind my my choice of C++. +I'm just explaining the reasoning behind my choice of C++. ## Swap Let us look at these three programs. -Why they are important? +Why are they important? Why is swap important? What does it deal with? Apparently it's not self-evident. -Once upon a time I was talking to a very famous program, +Once upon a time I was talking to a very famous programmer, supposedly the best programmer A9 ever had. -I told him about these three things and he looks at me +I told him about these three things and he looks at me and said, "I never had to use swap in my life". I don't know... I was very impressed because you swap for sorting, for reversing the sequence, for rotating the sequence, -for all kind of operations. -Basically, if you do something with a -sequence you swap. - so it is very important practically. -But, it also happens to be very important -theoretically because a long time ago when people were starting group theory[^group-theory] -They discovered that any permutation of a sequence could be generated out of swap[^permutation]. +for all kinds of operations. +Basically if you do something with a +sequence, you swap. +So it is very important practically. +But it also happens to be very important +theoretically, because a long time ago when people were starting [group theory][group-theory] +they discovered that any permutation of a sequence could be generated out of swap[^permutation]. Swap is the most primitive operation. -the reason is sequence, and any other -permutation, can be constructed out of swap. +The reason is sequence. And any other +permutation can be constructed out of swap. -But, apparently not everyone, even famous programmers, +But apparently not everyone (even famous programmers) realized that. Well, he had to claim that the language he thought was the greatest language was great, and since it couldn't do swap, -what do you? +what do you do? You deny the utility of swap. -[^group-theory]: [Group theory](https://en.wikipedia.org/wiki/Group_theory) - is one of the main subjects of abstract algebra. - The key idea is that many mathematical structures behave similarly. - You can add and subtract numbers, you can add and subtract vectors and matrices. - Can we study all structures which can add and subtract all together? - It turns out you can, and one such structure is a group. +[group-theory]: https://en.wikipedia.org/wiki/Group_theory + +[^about-group-theory]: Groups, monoids, and rings are a few of the subjects of abstract algebra, + a field which studies the fundamental properties of mathematical structures. + The key idea is that many different mathematical objects appear to function similarly. + Vectors and matrices can be "added" and "subtracted" just like integers. + In what ways are they fundamentally the same? + One explanation is that all of them form a group. 
+ Below is a formal definition: + + A **group** is a set `G` with a binary operation `* : G x G -> G` such that: + + 1. `G` contains an identity element `e` in `G` such that `e * x = x * e = x` for all `x` in `G`. + 2. The operation `*` is associative. So `((x * y) * z) = (x * (y * z))` for all `x, y, z` in `G`. + 3. Every element `x` in `G` has an inverse element `y` such that x * y = y * x = e. + + For example integers are a group with the operation of addition and the identity element 0. - Alex's ideas about generic programming are inspired - by abstract algebra. + 1. `0 + x = x + 0 = x` + 2. `((x + y) + z) = (x + (y + z))`. + 3. `x + (-x) = (-x) + x = 0`. + The process of discovering and applying generic concepts is very similar. + Alex introduces the basics of abstract algebra, from a programmers perspective, + in his book "From Mathematics to Generic Programming". [^permutation]: A [permutation](https://en.wikipedia.org/wiki/Permutation_group) is a bijection (1-1, onto) map from a set to itself. @@ -213,7 +238,7 @@ for it, you don't want to worry about whether `inline` is on the front or not. ### Specialized swap For some types, this `swap` will perform poorly. -Could you give an example of it is horribly inefficient? +Could you give an example of it being horribly inefficient? What about a large container? Consider: @@ -229,8 +254,8 @@ So we have generic code which works everywhere, except it's very slow. What should we do if someone says, "I have a wonderful generic solution, very abstract, but it takes a million iterations when there should be three."? Throw him out. There is no excuse. -Then he says, "oh, but I could use [tropical semirings][tropical]", -take tropical semirings and do something to him and them[^tropical]. +Then he says, "Oh, but I could use [tropical semirings][tropical]". +Take tropical semirings and do something to him and them[^tropical]. If you think about the algorithm that needs to take place, it requires *knowledge of how vector actually stores data*. @@ -239,7 +264,7 @@ but the pointers to their contents should be swapped. A central feature of a container is ownership of the elements. So the elements and container go together. -For things of this kind, we nee to write a special swap. +For things of this kind, we need to write a special swap. template // T is Semi-Regular @@ -250,7 +275,7 @@ For things of this kind, we nee to write a special swap. } It would be wonderful to be able to just type those comments and the -and compiler will do it for us sadly enough we're not there yet. +compiler will do it for us. Sadly enough we're not there yet. When we write a special version of this function, it is called **partial template specialization**. @@ -261,8 +286,7 @@ and the `T` parameter is still generic. [^tropical]: Alex himself uses Tropical semi-rings to describe several algorithms in his book "From Mathematics to Generic Programming" (See chapter 8.6). - So his issue here is not algebraic abstractions, but pursing abstraction - with enormous cost. + So his issue here is not abstraction itself, rather that it can become too costly. [^move]: Since C++11 this issue has been addressed by [move semantics](https://en.cppreference.com/w/cpp/language/move_constructor) @@ -282,7 +306,7 @@ and the `T` parameter is still generic. On the other hand, it solves this problem generally, so that one does not need to write custom swaps for every data structure. 
- It shifts the responsbility to the data structure author, + It shifts the responsibility to the data structure author, instead of every algorithm which might use it. ## XOR swap @@ -304,24 +328,24 @@ Yes, there is a [beautiful algorithm][xor-swap] using XOR[^xor]. some basic properties of `^`. (See solution[^xor-proof].) What are the requirements for this algorithm? -Specifically, what types have an `XOR` operator `^`. +Specifically, what types have an `XOR` operator `^`? Could we use it on `int`? Yes, but it's a bad idea. The language standard says that the -result of `XOR` for the signed bit is not defined. +result of `XOR` for the sign bit is not defined. If it is a positive integer you know what is going on for the sign bits. When it's negative you have no idea. So use it for `unsigned int`, or `char`. Basically, unsigned integral types. -So, it's not particularly useful. +So it's not particularly useful. -But, there is a case where it doesn't work, +But there is a case where it doesn't work, which is weird because we have a proof it does work (if you did the exercise above). -But, in our proof we made the +In our proof we made the small assumption that `x` and `y` are different objects. Because if they happen to be the same object, the value it contains at -the end of this function will be always be zero. +the end of this function will always be zero. Every bit will be zapped completely totally and absolutely. We could fix this by wrapping the body in: @@ -332,7 +356,7 @@ We could fix this by wrapping the body in: Is it a good idea? In this case, it's important for correctness. -But, be careful. +But be careful. We should never add it to the other swap because it adds more work to the common case. @@ -345,7 +369,7 @@ So you add to your code: // do normal path } -But, if you optimize for 5, and you have more than three integers, +But if you optimize for 5, and you have more than three integers, it will seldom be 5 and you will be doing the check all the time. So two rules: @@ -354,8 +378,8 @@ So two rules: 2. Don't optimize uncommon cases -[^xor]: The ^ symbol is bitwise [exclusive or](https://en.wikipedia.org/wiki/Exclusive_or). - The statement `a XOR b` means a is true, or b is true, but not both. +[^xor]: The `^` symbol is bitwise [exclusive or](https://en.wikipedia.org/wiki/Exclusive_or). + The expression `a ^ b` means `a` is true, or `b` is true, but not both. It is defined by the following truth table: a | b | a XOR b @@ -386,28 +410,28 @@ So two rules: `inline` is one of the things which will go away. There are certain things in C and C++ which are there because compiler technology was imperfect. When I started in C++ in 1986 -I had to write the keyword [`register`][register] everywhere because believe it or -not compilers wouldn't use registers[^registers] unless you specifically indicated that something goes into the register. +I had to write the keyword [`register`][register] everywhere because, believe it or +not, compilers wouldn't use registers[^registers] unless you specifically indicated that something goes into the register. Of course if it went into the register you could never use address operator `&` because obviously registers do not have addresses. It was a very special thing you needed to worry about. It was important in my measurements at the time. -Stripping `register` declarations from fundamental algorithms caused a slow down by a factor of three. +Stripping `register` declarations from fundamental algorithms caused a slowdown by a factor of three. Why? 
For every call to `++` the assembly code first did a load and then after it stored the result. -At that time computers used to do one thing at the time. +At that time computers used to do one thing at a time. So by adding a load and store around everything, it basically tripled the time. This is no longer true, meaning that computers no longer execute one operation at a time, as we will discover. For sure, you never need to worry about registers. -In modern computers, this is utterly idiotic you should never do it. -In the same way the compiler is perfectly theoretically capable +In modern computers this is utterly idiotic; you should never do it. +In the same way the compiler is perfectly theoretically capable of figuring out what needs to be `inline`, much more than you. But, we're living in -this transition time. +this transition time. I think about five years from now you will never need to write `inline`. Compilers will do it. @@ -415,7 +439,7 @@ Right now it still makes a difference. You remove this `inline` and you could get enormous performance difference. It could be a factor of 10 for something like swap. -The problem is that the function call sequences, +The problem is that the function call sequence, is a bad sequence. @@ -423,14 +447,14 @@ is a bad sequence. [^registers]: In CPU architecture, a register is a slot on the CPU - that can store a single value (typically 32 bit or 64 bits). + that can store a single value (typically 32 or 64 bits). Most CPU operations are confined to operating on values in registers. For example, an "add" instruction might add the value in one register, - to another register, and then stores the value in a third register. + to another register, and then store the value in a third register. Separate "load" and "store" instructions are used to move a value between a register and a location in memory. - Typically a CPU has only a handful (fewer than 50) of register, so a large part of the program + Typically a CPU has only a handful of registers (fewer than 50), so a large part of the program is spent moving values from memory into registers so they can be operated on, and then transferring the results back to main memory. @@ -438,4 +462,3 @@ is a bad sequence. ## Code - [swap.h](code/swap.h) - diff --git a/06_min_max.html b/06_min_max.html index fd990ca..d779afc 100644 --- a/06_min_max.html +++ b/06_min_max.html @@ -2,116 +2,9 @@ -6. Ordering, min, and max. - + +6. Ordering, min, and max + @@ -127,8 +20,8 @@ - -

6. Ordering, min, and max.

+ +

6. Ordering, min, and max

Learning to design code

@@ -139,7 +32,7 @@

Learning to design code

you to think so that you could design something equal or better. So most of the algorithms we are looking at are in STL, but they are not exposed. -They’re beyond what the standard Committee would ever consider. +They’re beyond what the C++ Standard Committee would ever consider. You might say, “Alex this is a simple problem. Couldn’t you show us how to build a search engine?” Not in a class.

@@ -164,7 +57,7 @@

Reviewing Total Orderings

It’s important to note that this was a choice I made. They give you roughly the same universe of things, so you could design it around either. -But, somewhow I felt < is a more fundamental relation. +But somehow I felt < is a more fundamental relation. < requires a little less typing. There are other reasons, too.

@@ -180,10 +73,10 @@

Reviewing Total Orderings

    Axiom 4: If a != b then a < b or b < a.

-

This is also called the trichomoty law, +

This is also called the trichotomy law, because for all elements, exactly one of three things must be true1:

-
(a < b) or (b < a) or (a = b)
+
(a < b) or (b < a) or (a == b)
 

There is a fundamental connection between < and ==. @@ -218,17 +111,17 @@

Weak orderings

There is some equivalence relation, such as last name.

Some people would say, well equal is a kind of equivalence, -so let’s define < to just be weak ordering. +so let’s define < to just be weak ordering. I say it’s evil. Why? Because for TotallyOrdered we need to be able to know

-
!(a < b)  <=> (a >= b)
+
!(a < b) <=> (a >= b)
 

The == is equality, not another equivalence relation. We can’t conclude that with a weak ordering. -We must overloaded symbols for what they commonly mean.

+We must overload symbols for what they commonly mean.
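    A small illustration (our example): ordering people by last name is a strict weak ordering, and what it induces is equivalence of last names, not equality of people:

        struct person {
          std::string first_name;
          std::string last_name;
        };

        struct compare_last_name {
          bool operator()(const person& a, const person& b) const {
            return a.last_name < b.last_name;
          }
        };

    Here !cmp(a, b) && !cmp(b, a) only says two people share a last name; it does not make them equal.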

Min

@@ -259,9 +152,9 @@

When elements are equal

What should we return when a == b? It seems it doesn’t matter. -But, that’s the problem. +But that’s the problem. Everywhere in programming you do something and it seems to be correct. -But, you have to think deeply and then you discover a problem. +But you have to think deeply and then you discover a problem. There is nothing little in programming.

Let me construct a proof for you. @@ -271,7 +164,7 @@

When elements are equal

It’s a good requirement.

Another requirement is if I sort two things, -The first guy should be the min afterwards +the first guy should be the min afterwards and the second guy should be the max. We agreed that default ordering for sorting should be ascending. Zero, one, two, three is natural. @@ -291,15 +184,14 @@

When elements are equal

When you design a function you often think about just this function, and ignore how it interacts with other parts of your API. Then you discover inconsistencies which can be very subtle but painful. -There are several profound mistakes in STL +There are several profound mistakes in STL. They are still in the standard, despite all my attempts to change them. It’s very easy to make a mistake and it’s really hard to fix it.

Correct implementation

-

Now we have a correct version for TotallyOrdered, let’s generalize it - for StrictWeakOrdering. +

Now we have a correct version for TotallyOrdered, let’s generalize it for StrictWeakOrdering. We no longer can rely on the < operator on the type, as there may be many orderings and equivalence relations on a type.

@@ -308,9 +200,9 @@

Correct implementation

inline const T& min(const T& a, const T& b, Compare cmp) { if (cmp(b, a)) { - return a; + return b; } else { - return b; + return a; } }
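    For example (our usage sketch), any strict weak ordering can now be plugged in:

        struct less_by_abs { // hypothetical comparison: order by magnitude
          bool operator()(int a, int b) const {
            return (a < 0 ? -a : a) < (b < 0 ? -b : b);
          }
        };

        int a = -3;
        int b = 2;
        int m = min(a, b, less_by_abs()); // cmp(b, a) holds, so m == 2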
@@ -322,7 +214,7 @@

Correct implementation

  • Functionality: Making it a type would allow it to have state. Pointers to functions have no state.

  • Performance: If we were passing a pointer it would have to do a function call through the pointer. The function call has to save and restore registers. -It’s slow especially if it sits inside a loop and is called a gazallion times.

  • +It’s slow especially if it sits inside a loop and is called a gazillion times.

    @@ -332,7 +224,7 @@

    Less than function object

    It’s somewhat inconvenient to pass cmp when you actually want to use a TotallyOrdered type. Therefore there should be a version of min which doesn’t take that parameter. The wrong way is to use a default template argument. -Sometimes you want to get a pointer to function the function min +Sometimes you want to get a pointer to the function min with a comparison function inserted. For example:

    @@ -395,10 +287,10 @@

    Max

    Let us see why.

    It seems that max is just min with >. -So wh do we need it? +So why do we need it? We still want to provide what is convenient for the customer. When they think max and go looking for it, it should somehow work. -But, its a little bit more.

    +But, it’s a little bit more.

    Sort2

    @@ -416,11 +308,11 @@

    Sort2

    }
    -

    It’s always preferrable to sort in-place because we can obtain a composable +

    It’s always preferable to sort in-place because we can obtain a composable one by first copying, and then applying the in-place algorithm.
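    For instance (our sketch, assuming the sort2 above takes (a, b, cmp) by reference), the composable version is just a copy followed by the in-place algorithm:

        template <typename T, typename Compare>
        // T is Semi-Regular, Compare is a StrictWeakOrdering on T
        std::pair<T, T> sort2_copy(T a, T b, Compare cmp) {
          sort2(a, b, cmp); // reuse the in-place version on the copies
          return std::make_pair(a, b);
        }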

    Note once again the order of comparison. -We have to be careful that aren’t going to swap when they are equal. +We have to be careful that we aren’t going to swap when they are equal. I want the following invariant, after sort2 a contains min and b contains max. It’s very natural. @@ -463,7 +355,7 @@

    Implementation

    committee he always says, “Alex you are too theoretical”. I guess I am because I pay attention to these little details. -But I claim to be able to to show you that +But I claim to be able to show you that things like that will matter.

    Later we will generalize it on a bunch of things. @@ -486,7 +378,7 @@

    Fundamental logical laws are not always obeyed

    We assume these statements are true, but it’s not true. -There are exceptions whic cause enourmous amounts of harm +There are exceptions which cause enormous amounts of harm and break all the laws of equality and ordering.

    double x(0.0/0.0);
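    // (Our addition.) Both laws the text lists now fail for x:
    assert(!(x == x));            // a == a does not hold for NaN
    assert(!(x < x) && !(x > x)); // x is neither less, greater, nor equal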
    @@ -502,7 +394,7 @@ 

    Fundamental logical laws are not always obeyed

    The problem actually appears when people do complicated things. Perhaps they do millions of computations and then sort them. Sort assumes that equality and inequality work like -they should and bad things happen.

    +they should and bad things happen.

    The IEEE floating point standard is one of the great accomplishments of computer science. One of the top five. @@ -535,8 +427,8 @@

    Fundamental logical laws are not always obeyed

    I’m advocating the second one. We keep the laws and define singularities. If there are singular values, the universe collapses, -you know nothing applies. -you have to assure that singular values do not appear in your computation.

    +you know nothing applies. +You have to assure that singular values do not appear in your computation.

    Final code

    @@ -559,7 +451,7 @@

    Final code

    Neither is { 1, 2, 3 } a subset of { 1, 4 }. The two elements are incomparable.
  • -Function objects like this in C++ fulfill +Function objects like this in C++ fulfil the same role as closures or lambdas. They capture variables or state, and then use them to evaluate the function. The main difference is that their saved context is explicit rather diff --git a/06_min_max.md b/06_min_max.md index c0e23db..a9e2028 100644 --- a/06_min_max.md +++ b/06_min_max.md @@ -1,4 +1,4 @@ -6. Ordering, min, and max. +6. Ordering, min, and max ============================= ## Learning to design code @@ -9,7 +9,7 @@ I want to teach you to think so that you could design something equal or better. So most of the algorithms we are looking at are in STL, but they are not exposed. -They're beyond what the standard Committee would ever consider. +They're beyond what the C++ Standard Committee would ever consider. You might say, "Alex this is a simple problem. Couldn't you show us how to build a search engine?" Not in a class. @@ -34,7 +34,7 @@ instead of `<=`? It's important to note that this was a choice I made. They give you roughly the same universe of things, so you could design it around either. -But, somewhow I felt `<` is a more fundamental relation. +But somehow I felt `<` is a more fundamental relation. `<` requires a little less typing. There are other reasons, too. @@ -48,12 +48,12 @@ It's guaranteed. **Axiom 3:** Anti-symmetric: If `a < b` then `!(b < a)`. -**Axiom 4:** If `a != b` then `a < b` or ` b > a`. +**Axiom 4:** If `a != b` then `a < b` or `b > a`. -This is also called the trichomoty law, +This is also called the trichotomy law, because for all elements, exactly one of three things must be true[^eop-ordering]: - (a < b) or (b < a) or (a = b) + (a < b) or (b < a) or (a == b) There is a fundamental connection between `<` and `==`. If `!(b < a)` then it must be the case that `b >= a`. @@ -88,16 +88,16 @@ There is some equivalence relation, such as last name. Some people would say, well equal is a kind of equivalence, -so let's define `<` to just be weak ordering. +so let's define `<` to just be weak ordering. I say it's evil. Why? Because for `TotallyOrdered` we need to be able to know - !(a < b) <=> (a >= b) + !(a < b) <=> (a >= b) The `==` is equality, not another equivalence relation. We can't conclude that with a weak ordering. -We must overloaded symbols for what they commonly mean. +We must overload symbols for what they commonly mean. [^eop-ordering]: Alex recommends chapter 4 of "Elements of Programming" to learn more about this subject. @@ -135,9 +135,9 @@ and 5 and 3 are literals which are constant. What should we return when `a == b`? It seems it doesn't matter. -But, that's the problem. +But that's the problem. Everywhere in programming you do something and it seems to be correct. -But, you have to think deeply and then you discover a problem. +But you have to think deeply and then you discover a problem. There is nothing little in programming. Let me construct a proof for you. @@ -147,7 +147,7 @@ Nothing should be swapped. It's a good requirement. Another requirement is if I sort two things, -The first guy should be the `min` afterwards +the first guy should be the `min` afterwards and the second guy should be the `max`. We agreed that default ordering for sorting should be ascending. Zero, one, two, three is natural. @@ -166,14 +166,13 @@ So let's correct it, so we don't swap unless necessary. When you design a function you often think about just this function, and ignore how it interacts with other parts of your API. 
Then you discover inconsistencies which can be very subtle but painful. -There are several profound mistakes in STL +There are several profound mistakes in STL. They are still in the standard, despite all my attempts to change them. It's very easy to make a mistake and it's really hard to fix it. ### Correct implementation -Now we have a correct version for `TotallyOrdered`, let's generalize it - for `StrictWeakOrdering`. +Now we have a correct version for `TotallyOrdered`, let's generalize it for `StrictWeakOrdering`. We no longer can rely on the `<` operator on the type, as there may be many orderings and equivalence relations on a type. @@ -182,9 +181,9 @@ orderings and equivalence relations on a type. inline const T& min(const T& a, const T& b, Compare cmp) { if (cmp(b, a)) { - return a; + return b; } else { - return b; + return a; } } @@ -195,14 +194,14 @@ There are two reasons: 2. **Performance:** If we were passing a pointer it would have to do a function call through the pointer. The function call has to save and restore registers. - It's slow especially if it sits inside a loop and is called a gazallion times. + It's slow especially if it sits inside a loop and is called a gazillion times. ## Less than function object It's somewhat inconvenient to pass `cmp` when you actually want to use a `TotallyOrdered` type. Therefore there should be a version of `min` which doesn't take that parameter. The wrong way is to use a default template argument. -Sometimes you want to get a pointer to function the function `min` +Sometimes you want to get a pointer to the function `min` with a comparison function inserted. For example: @@ -218,7 +217,6 @@ So we write a second interface: return min(a, b, std::less()); } - Let's implement a standard class called [`std::less`][cpp-less]. It overrides the evaluation operator so it can be called just like a function[^function-objects]. @@ -262,7 +260,7 @@ Remember the faster computers get, the slower function calls are. Does it matter? Not at all. -[^function-objects]: Function objects like this in C++ fulfill +[^function-objects]: Function objects like this in C++ fulfil the same role as [closures or lambdas][sicp-env]. They capture variables or state, and then use them to evaluate the function. The main difference is that their saved context is explicit rather @@ -283,10 +281,10 @@ So, the one in the standard is still broken. Let us see why. It seems that `max` is just `min` with `>`. -So wh do we need it? +So why do we need it? We still want to provide what is convenient for the customer. When they think `max` and go looking for it, it should somehow work. -But, its a little bit more. +But, it's a little bit more. ### Sort2 @@ -302,11 +300,11 @@ To see how they should all work, let's write `sort2`, which sorts two things. } } -It's always preferrable to sort in-place because we can obtain a composable +It's always preferable to sort in-place because we can obtain a composable one by first copying, and then applying the in-place algorithm. Note once again the order of comparison. -We have to be careful that aren't going to swap when they are equal. +We have to be careful that we aren't going to swap when they are equal. I want the following invariant, after `sort2` `a` contains min and `b` contains max. It's very natural. @@ -346,7 +344,7 @@ Every time I talk to one member of the standard committee he always says, "Alex you are too theoretical". I guess I am because I pay attention to these little details. 
-But I claim to be able to to show you that +But I claim to be able to show you that things like that will matter. Later we will generalize it on a bunch of things. @@ -367,7 +365,7 @@ Certain thing should always work, such as the following: We assume these statements are true, but it's not true. -There are exceptions whic cause enourmous amounts of harm +There are exceptions which cause enormous amounts of harm and break all the laws of equality and ordering. double x(0.0/0.0); @@ -382,7 +380,7 @@ and is utterly invisible to you. The problem actually appears when people do complicated things. Perhaps they do millions of computations and then sort them. Sort assumes that equality and inequality work like -they should and bad things happen. +they should and bad things happen. The [IEEE floating point standard][float] is one of the great accomplishments of computer science. One of the top five. @@ -411,14 +409,11 @@ There are two solutions I'm advocating the second one. We keep the laws and define singularities. If there are singular values, the universe collapses, -you know nothing applies. -you have to assure that singular values do not appear in your computation. +you know nothing applies. +You have to assure that singular values do not appear in your computation. [float]: https://en.wikipedia.org/wiki/IEEE_754 ## Final code - [minmax.h](code/minmax.h) - - - diff --git a/07_min_range.html b/07_min_range.html index 5cb1d35..0eb5b7f 100644 --- a/07_min_range.html +++ b/07_min_range.html @@ -2,116 +2,9 @@ + 7. Minimum selection on ranges - + @@ -190,9 +83,9 @@

    Iterator conventions

    Notice that min_element doesn’t return the value itself, but the iterator pointing to the minimum element. Why? -We probably want to update the value.

    +Because we probably want to update the value.

    -

    Suppose, I’m a manager I want to look for the +

    Suppose I’m a manager and I want to look for the worst-performing guy and then fire him (joke). I don’t want his value, I want a handle on him. I want an iterator. @@ -202,13 +95,13 @@

    Iterator conventions

    There is another reason. The range might be empty1. -In which case we return last iterator.

    +In which case we return the last iterator.

    first and last are maybe bad names, but that’s what they are. They are hard to change, because I called them that everywhere. -last actual doesn’t mean last. -last means one after the last element. +last actually doesn’t mean last. +last means one after the last element2. +In order to define a sequence you need to point past the last. Because you want to be able to work with empty ranges.

            a b c d
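    A usage sketch (ours; std::min_element follows the same convention as the chapter’s version):

        std::vector<double> scores; // empty today
        std::vector<double>::iterator worst =
            std::min_element(scores.begin(), scores.end());
        if (worst == scores.end()) {
          // empty range: there is no one to fire
        }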
    @@ -229,11 +122,11 @@ 

    Iterator conventions

    But there may be no good, or no bad elements. We need to be able to return an empty range.

    -

    In C++ this is a standard convention some +

    In C++ this is a standard convention. Some people in the world of Java and Python are slowly realizing that maybe it -has something to do with mathematics not with C++. -But, it will take decades before people fully realize that you have to always go -passed the end. +has something to do with mathematics and not with C++. +But it will take decades before people fully realize that you have to always go +past the end. Mathematically you need semi-open intervals. They are denoted like so:

    @@ -255,9 +148,9 @@

    Forward iterator

    as opposed to random jumps. What is an InputIterator? Input iterators describe algorithms -which go through the river once2. -It’s like they’re reading stuff from the wire3. -Imagine the various kinds of streams4.

    +which go through the river once3. +It’s like they’re reading stuff from the wire4. +Imagine the various kinds of streams5.

    But, in our algorithm we store a previous position in the variable min_el. Therefore the things which go through the wire will not work. @@ -266,25 +159,25 @@

    Forward iterator

    Finding min and max together

    -

    How many comparisons do we need to to find the min of five elements? +

    How many comparisons do we need to find the min of five elements? Four. In general why do we need n - 1 comparisons. Why no more? We don’t need to compare an element with itself. Why no fewer? Maybe we could do it -in n - 2 with a clever algorithm. +in n - 2 with a clever algorithm? The simple argument to remember is that n - 1 guys have to lose. We’re finding a winner in our competition. -If person didn’t play in a competition, if he didn’t lose, +If a person didn’t play in a competition, if he didn’t lose, we cannot eliminate him. We need to eliminate all but one.

    How many comparisons if we need to find minimum and maximum together? Obviously we could do 2n - 2, -what about fewer5? +but what about fewer6? The idea is very simple. -Assume that we worked up to the middle of +Assume that we worked up to the middle of our range and we have a running min and a running max. The temptation is to take the next element and compare him with min and with max. That’s very sad because very often we will do two comparisons and then discount the other. @@ -378,7 +271,7 @@

    Finding min and max together

    auto pair = minmax_element(a, a + n, std::less<int>());
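    Here is the pairing idea as code (our value-based sketch; the real algorithm works on iterators and handles the bookkeeping more carefully):

        // 3 comparisons per pair: order the pair, then compare the
        // smaller with the running min and the larger with the running max.
        template <typename T>
        std::pair<T, T> minmax_pairs(const std::vector<T>& v) {
          std::pair<T, T> r(v.front(), v.front()); // v must be non-empty
          std::size_t i = 1;
          for (; i + 1 < v.size(); i += 2) {
            T a = v[i];
            T b = v[i + 1];
            if (b < a) std::swap(a, b);
            if (a < r.first) r.first = a;
            if (r.second < b) r.second = b;
          }
          if (i < v.size()) { // one element left over
            if (v[i] < r.first) r.first = v[i];
            if (r.second < v[i]) r.second = v[i];
          }
          return r;
        }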
    -

    This algorithm was invented by Ira Pohl6 of UC Santa Cruz, in the mid-seventies. +

    This algorithm was invented by Ira Pohl of UC Santa Cruz, in the mid-seventies7. He also proved it was optimal. It’s also practically good. It was added to the standard in C++11.

    @@ -398,41 +291,54 @@

    Code

    Many languages struggle with empty ranges because they prefer their algorithms to work with values instead of iterators. For example, they might return nil or a boolean which indicates -whether the value was found or not.

    +whether the value was found or not, or they may throw an exception.

    -

    Of course, unless they are dealing with a nice language, this adds a few lines of code, -or the possiblity that you forget to check for the nil case, and perhaps +

    Of course, this adds a few lines of code, +or the possibility that you forget to check for the nil case, and perhaps restricts the algorithm to reference types only.

  • -“No man ever steps in the same river twice.” - Heraclitus
  • +In “Elements of Programming” +the term “limit” is used instead of “last” to avoid this confusion. +It also happens to start with the same letter, so +the abbreviation [f, l) can be read as either word.
  • +“No man ever steps in the same river twice.” - Heraclitus
  • +
  • “Through the wire” refers to receiving data from a communication device, such as the internet. You read packets as they arrive, and they aren’t stored anywhere, -so once you read them, they are gone.
  • -
  • -

    Another example of the InputIterator concept is UNIX pipes. -Consider the command:

    +so once you read them, they are gone.
  • +
  • +

    Another example of the InputIterator concept is UNIX pipes. +Pipes transfer data which is output from one program, to the input of another. +Consider the following shell command:

    -
    head -c 500 /dev/urandom | gzip
    +
    head -c 50000 /dev/urandom | gzip
     
    -

    Head will read 500 random characters, but that data isn’t stored anywhere, -it’s immediately passed to stdout, -which is stdin for gzip. -gzip reads data from stdin. -It can only read data as it comes in. -It can’t seek back earlier in the input. +

    head reads 50000 random characters, +and immediately outputs it (to stdout). +This output then becomes the input for gzip (stdin) +which compresses it.

    + +

    The two programs run concurrently not sequentially. +When head reads a small chunk of data, it can be immediately +written to gzip. +In this way, the data transfer operates like InputIterator. +Neither program has access to all the data at once. +They can only read pieces of data as they come in. +They can’t seek back earlier in the input. Once it is read, it is gone.

    -

    This is one of the reasons why gzip is so useful. -It compresses data on the fly, without being able to see the incoming data, -or re-analyzing what came before.

  • -
  • -First time find the min (n - 1 comparisons). Then find the max (n - 1 comparisons). - (n - 1) + (n - 1) = 2n - 2.
  • +

    The ability of gzip to operate on “input iterator”-like +streams is one of the reasons why it is so versatile. +It can compress data while it is being generated, or downloaded, +without being able to see the complete data.

  • -Alex has written more about Ira and this algorithm here
  • +First find the min (n - 1 comparisons), then find the max (n - 1 comparisons). + (n - 1) + (n - 1) = 2n - 2. +
  • +For more about this algorithm, see Alex’s presentation “One algorithm from The Book: A tribute to Ira Pohl”.
  • diff --git a/07_min_range.md b/07_min_range.md index d5c564b..2bcc235 100644 --- a/07_min_range.md +++ b/07_min_range.md @@ -61,9 +61,9 @@ We need to talk about iterator conventions. Notice that `min_element` doesn't return the value itself, but the iterator pointing to the minimum element. Why? -We probably want to update the value. +Because we probably want to update the value. -Suppose, I'm a manager I want to look for the +Suppose I'm a manager and I want to look for the worst-performing guy and then fire him (joke). I don't want his value, I want a handle on him. I want an iterator. @@ -73,13 +73,13 @@ things with what you find. There is another reason. The range might be empty[^emptyrange]. -In which case we return last iterator. +In which case we return the last iterator. `first` and `last` are maybe bad names, but that's what they are. They are hard to change, because I called them that everywhere. -`last` actual doesn't mean last. -`last` means one after the last element. -In order to define a sequence you need to point past the last. +`last` actually doesn't mean last. +`last` means one after the last element[^last-called-limit]. +In order to define a sequence you need to point past the last. Because you want to able to work with empty ranges. @@ -91,8 +91,6 @@ Suppose `last` is actually last and you point to the same place. That indicates a range of one element. There is no way to do zero. - - Later, we will look at algorithms for partitioning. We want to partition good people from bad people. After the partition we will return a pointer which separates good from bad. @@ -102,11 +100,11 @@ the first bad and first good. But there may be no good, or no bad elements. We need to able to return an empty range. -In C++ this is a standard convention some +In C++ this is a standard convention. Some people in the world of Java and Python are slowly realizing that maybe it -has something to do with mathematics not with C++. -But, it will take decades before people fully realize that you have to always go -passed the end. +has something to do with mathematics and not with C++. +But it will take decades before people fully realize that you have to always go +past the end. Mathematically you need [semi-open intervals][half-open]. They are denoted like so: @@ -119,12 +117,17 @@ Or in our terms: [^emptyrange]: Many languages struggle with empty ranges because they prefer their algorithms to work with values instead of iterators. For example, they might return `nil` or a boolean which indicates - whether the value was found or not. + whether the value was found or not, or they may throw an exception. - Of course, unless they are dealing with a nice language, this adds a few lines of code, - or the possiblity that you forget to check for the nil case, and perhaps + Of course, this adds a few lines of code, + or the possibility that you forget to check for the nil case, and perhaps restricts the algorithm to reference types only. +[^last-called-limit]: In "Elements of Programming" + the term "limit" is used instead of "last" to avoid this confusion. + It also happens to start with the same letter, so + the abbrevation `[f, l)` can be read as either word. + [half-open]: https://mathworld.wolfram.com/Half-ClosedInterval.html @@ -153,46 +156,55 @@ So it must be a `ForwardIterator`. so once you read them, they are gone. [^stdin]: - Another example of the `InputIterator` concept is UNIX pipes. 
- Consider the command: - - head -c 500 /dev/urandom | gzip - - Head will read 500 random characters, but that data isn't stored anywhere, - it's immediately passed to `stdout`, - which is `stdin` for [gzip][gzip]. - gzip reads data from `stdin`. - It can only read data as it comes in. - It can't seek back earlier in the input. + Another example of the `InputIterator` concept is [UNIX pipes][unix-pipes]. + Pipes transfer data which is output from one program, to the input of another. + Consider the following shell command: + + head -c 50000 /dev/urandom | gzip + + `head` reads 50000 random characters, + and immediately outputs it (to `stdout`). + This output then becomes the input for `gzip` (`stdin`) + which compresses it. + + The two programs run concurrently not sequentially. + When `head` reads a small chunk of data, it can be immediately + written to `gzip`. + In this way, the data transfer operates like `InputIterator`. + Neither program has access to all the data at once. + They can only read pieces of data as they come in. + They can't seek back earlier in the input. Once it is read, it is gone. - This is one of the reasons why gzip is so useful. - It compresses data on the fly, without being able to see the incoming data, - or re-analyzing what came before. + The ability of `gzip` to operate on "input iterator"-like + streams is one of the reasons why it is so versatile. + It can compress data while it is being generated, or downloaded, + without being able to see the complete data. [gzip]: https://linux.die.net/man/1/gzip +[unix-pipes]: https://en.wikipedia.org/wiki/Pipeline_(Unix) ## Finding min and max together -How many comparisons do we need to to find the min of five elements? +How many comparisons do we need to find the min of five elements? Four. In general why do we need `n - 1` comparisons. Why no more? We don't need to compare an element with itself. Why no fewer? Maybe we could do it -in `n - 2` with a clever algorithm. +in `n - 2` with a clever algorithm? The simple argument to remember is that `n - 1` guys have to lose. We're finding a winner in our competition. -If person didn't play in a competition, if he didn't lose, +If a person didn't play in a competition, if he didn't lose, we cannot eliminate him. We need to eliminate all but one. How many comparisons if we need to find minimum and maximum together? Obviously we could do `2n - 2`, -what about fewer[^minmax]? +but what about fewer[^minmax]? The idea is very simple. -Assume that we worked up to the middle of +Assume that we worked up to the middle of of our range and we have a running min and a running max. The temptation is take the next element compare him with min and with max. That's very sad because very often we will do two comparisons and then discount the other. @@ -284,7 +296,7 @@ Example: size_t n = sizeof(a) / sizeof(int); auto pair = minmax_element(a, a + n, std::less()); -This algorithm was invented by [Ira Pohl][pohl][^pohl] of UC Santa Cruz, in the mid-seventies. +This algorithm was invented by [Ira Pohl][pohl] of UC Santa Cruz, in the mid-seventies[^pohl]. He also proved it was optimal. It's also practically good. It was [added to][cpp-minmax] the standard in C++11. @@ -298,8 +310,8 @@ It was [added to][cpp-minmax] the standard in C++11. [pohl]: https://users.soe.ucsc.edu/~pohl/bio.htm [cpp-minmax]: https://en.cppreference.com/w/cpp/algorithm/minmax_element -[^minmax]: First time find the min (`n - 1` comparisons). Then find the max (`n - 1` comparisons). 
+[^minmax]: First find the min (`n - 1` comparisons), then find the max (`n - 1` comparisons). `(n - 1) + (n - 1) = 2n - 2`. -[^pohl]: Alex has written more about Ira and this algorithm [here](http://stepanovpapers.com/IraPohlFest.pdf) +[^pohl]: For more about this algorithm, see Alex's presentation ["One algorithm from The Book: A tribute to Ira Pohl"](http://stepanovpapers.com/IraPohlFest.pdf). diff --git a/08_lisp.html b/08_lisp.html index 23cf2eb..c5eb2ff 100644 --- a/08_lisp.html +++ b/08_lisp.html @@ -2,116 +2,9 @@ + 8. Lisp-like lists - + @@ -133,12 +26,11 @@

    8. Lisp-like lists

    Lists in lisp and Scheme

    -

    A long time ago there was a programming language called Lisp1 +

    A long time ago there was a programming language called Lisp or for you younger folks Scheme. -Scheme might have been wrong, but it was great. +Scheme might have been wrong, but it was great1. The whole language centers around very simple linked lists - which are based on three fundamental -operations2:

    +which are based on three fundamental operations2:

    1. cons: create a pair.
    2. @@ -155,10 +47,7 @@

      Lists in lisp and Scheme

      all the algorithms we want to use. So we are going to add a 4th operation:

      -
        -
      1. free: manually release/free a pair.
      2. -
      - +

       4. free: manually release/free a pair.

      What we want to do is muck around with lists. Meaning you can insert items in the middle, change pointers, connect this and that. @@ -169,9 +58,9 @@

      Lists in lisp and Scheme

      How are we going to do that? You want to avoid memory fragmentation. If you have lists -with nodes spread all over memory, every time -you access one, it is a cache mich. -Modern computers caches do not really help if you do long jumps. +with nodes spread all over memory, every time +you access one, it is a cache miss. +Modern computer caches do not really help if you do long jumps. We have lots of nodes, but we want them to live in a little buffer even if we keep generating them back and forth. If they reside in a small space we will never get a cache miss.

      @@ -187,24 +76,24 @@

      Why is malloc so slow?

      So for any data structure of nodes, such as list, I would keep a pool of nodes myself and manage them in a quick way.

      -

      A few people, such as Bill Plauga3 at Microsoft +

      A few people, such as Bill Plauger at Microsoft and others at GNU who followed their example, said that if they have a common pool and they just do pointer movement then if you have multiple threads you -could have problems4. +could have problems3. Instead of solving the problem for the multi-threaded case they decided to solve it in general. -They said, “first we’re going to put locks on our malloc5. +They said, “first we’re going to put locks on our malloc4. Then we’re going to throw Alex’s pool management away and we’re going to do full malloc.” Now malloc is function call with a lock, so it’s a very heavy operation.

      -

      Because of this decision, all our lists are going to be thread safe. +

      Because of this decision, all our lists are going to be thread safe5. People like us, who do not use threads (you don’t use threads right?) pay for them. They violated a fundamental principle which Bjarne insists on. People should not pay for things they do not use. Everybody pays for the ability of multiple threads to do list allocations out of the -same pool, which actually nobody does but everybody pays.

      +same pool. Which actually nobody does, but everybody pays.

      List pool

      @@ -213,20 +102,19 @@

      List pool

      with many outstanding lists inside. Internally we will use one vector to implement many, many, lists. These lists are not containers. -A container guarantees that when a container is gone, the values are gone too. +A container guarantees that when the container is gone, the values are gone too. For these lists there is no guarantee like that. For example, you could split this list into two by setting a cdr. There is no ownership and this is why I recommend not viewing them as containers. -STL containers are wonderful, when you want them, but that’s not the case.

      +STL containers are wonderful when you want them, but that’s not the case here.

      We’re trying to get as close to Lisp as we can without building garbage collection6. If you want to build garbage collection you can extend this thing and build garbage collection too, but garbage collection is overrated.

      -

      Implement it as a class, -we will have two types. +

      We will implement list_pool as a class, with two types as template arguments. T will be the values we want to store, and N will be an index type.

      @@ -235,33 +123,55 @@

      List pool

      // T is semi-regular. // N is integral class list_pool { + typedef N list_type; + + struct node_t { + T value; + N next; + }; + + std::vector<node_t> pool; + list_type free_list; + // ... };
    -

    Now we are going to implement cons, car, cdr, and free, but -we need appropriate names for a younger generation.

    +

    What should N be? Why not size_t? +Because it’s 64 bits. +For our application we could probably +use uint16_t so our whole node fits in 32 bits. +But, we should define a default.

    + +
    typename N = size_t;
    +
    + +

    Now we are going to implement cons, car, cdr, and free as member +functions of list_pool, but we need appropriate names for a younger generation.

    - -

    Car

    + +

    Value (car)

    We will rename car to value. -Actually, it won’t just be car, it will -also act as rplaca (set car).

    +Note that because we can return value by reference, +it can be both read and modified. +So, it won’t just be car, it will +also act as rplaca7.

    T& value(list_type x) {
       return node(x).value;
     }
     
    -const T& value(list_type x) cons {
    -  reutrn node(x).value;
    +const T& value(list_type x) const {
    +  return node(x).value;
     }
     
    - -

    Cdr

    + +

    Next (cdr)

    -

    Similarly, we want cdr and rplacd.

    +

    Let’s rename cdr to next. +Because of read and write it also acts as rplacd.

    list_type& next(list_type x) {
       return node(x).next;
    @@ -276,8 +186,8 @@ 

    Cdr

    Free

    Now let’s write free. -We can make it somewhat more useful by returning something other than void. -Return the next, otherwise the user will have to save it before freeing.

    +The pool maintains a list of nodes which are available for reuse. +This operation appends a node to the head of this list8.

    list_type free(list_type x) {
       list_type cdr = next(x);
    @@ -287,16 +197,16 @@ 

    Free

    }
    -

    This is the same as (setf (cdr x) free-list) in Lisp or (set-cdr! x free-list) -in Scheme.

    +

    We make it somewhat more useful by returning next(x) instead of void. +If it was not returned, the user would have to save it before freeing.

    - -

    Cons

    + +

    Allocate (cons)

    Now we will write cons, it takes two arguments. Where do nodes come from? The free list, if it has room, -otherwise we made a new node from the pool.

    +otherwise we make a new node from the pool.

    list_type allocate(const T& val, list_type tail) {
       list_type new_list = free_list;
    @@ -314,7 +224,7 @@ 

    Cons

    }
    -

    So we need to write the public function is_empty and the private one new_node.

    +

    So we need to write the public function is_empty.

    bool is_empty(list_type x) const {
       return x == empty();
    @@ -324,7 +234,7 @@ 

    Cons

    Dual to this function is one which gives you the nil or empty list.

    list_type empty() {
    -  return listp_type(0);
    +  return list_type(0);
     }
     
    @@ -333,24 +243,15 @@

    Cons

    the first item. If you use -1 then our index type must be signed.

    -
    typedef N list_type;
    -
    -list_pool() {
    +
    list_pool() {
       free_list = empty();
     }
     
    -

    Let’s write the class and private stuff now:

    +

    Now we write a few private node functions including new_node.

    -
    struct node_t {
    -  T value;
    -  N next;
    -};
    -
    -std::vector<node_t> pool.
    -
    -node_t& node(list_type x) {
    -  reutrn pool[x - 1];
    +
    node_t& node(list_type x) {
    +  return pool[x - 1];
     }
     
     const node_t& node(list_type x) const {
    @@ -368,22 +269,12 @@ 

    Cons

    Typically const is just for handing someone something to read.

    -

    What should N be. Why not size_t? -Because it’s 64 bits. -For our application we could probably -use uint16 so our whole node -fits in 32 bits. -But, we should define a default.

    - -
    typename N = size_t
    -
    -

    Free list helper

    There is a simple rule to distinguish when you should write a method/member function -and what to just make an outside function. +and when to just make an outside function (free function). Implement the simplest possible thing. If you can do it outside, do it.

    @@ -397,6 +288,10 @@

    Free list helper

    }
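Spelled out in full, such an outside function might read as follows; the loop is the one from the chapter's code, while the template signature is my reconstruction:

    template <typename T, typename N>
    void free_list(list_pool<T, N>& pool,
                   typename list_pool<T, N>::list_type x) {
      // free returns the next node, so the loop
      // advances through the list as it frees.
      while (!pool.is_empty(x)) x = pool.free(x);
    }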
    +

    Exercise: Before moving on, get familiar with these operations. + Create a simple list inside a pool and print it by iterating through its contents + (solved in test_list_pool.cpp at the end of the chapter).
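A minimal sketch of one way to start this exercise, using only the operations defined above (and assuming they, along with the list_type typedef, are public):

    list_pool<int, size_t> pool;
    typedef list_pool<int, size_t>::list_type list_type;

    // Build the list 1 2 3 back to front.
    list_type l = pool.empty();
    l = pool.allocate(3, l);
    l = pool.allocate(2, l);
    l = pool.allocate(1, l);

    // Walk the list, printing each value.
    while (!pool.is_empty(l)) {
      std::cout << pool.value(l) << " ";
      l = pool.next(l);
    }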

    +

    List queue

    @@ -414,14 +309,6 @@

    List queue

    pair_type empty_queue() { return pair_type(end(), end()); }
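The pair_type used here holds the front and back nodes of the queue; presumably the elided typedef reads something like:

    // A queue is its front node and its back node.
    typedef std::pair<list_type, list_type> pair_type;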
    -

    You can remove an element from the front of the queue:

    - -
    pair_type pop_front(const pair_type& p) {
    -  if (empty(p)) return p;
    -  return pair_type(next(p.first), p.second);
    -}
    -
    -

    You can add an element to the front, or the back of the queue:

    pair_type push_front(const pair_type& p, const T& value) {
    @@ -438,6 +325,14 @@ 

    List queue

    }
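The bodies of these two operations are elided above. A sketch of how push_front might read, given the operations we already have (my reconstruction; push_back differs in that it must also link the old back node to the new one):

    pair_type push_front(const pair_type& p, const T& value) {
      // The new node points at the old front.
      list_type new_node = allocate(value, p.first);
      if (empty(p)) return pair_type(new_node, new_node);
      return pair_type(new_node, p.second);
    }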
    +

    You can remove an element from the front of the queue9:

    + +
    pair_type pop_front(const pair_type& p) {
    +  if (empty(p)) return p;
    +  return pair_type(next(p.first), p.second);
    +}
    +
    +

    Now we can also free lists in constant time, simply by attaching the end of our list to the free list.
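A sketch of the member function this describes (my reconstruction; the queue-level free(p.first, p.second) in the chapter's code calls a two-argument overload like this):

    // Free the whole list [front, back] in constant time by
    // splicing it onto the head of the free list.
    void free(list_type front, list_type back) {
      if (is_empty(front)) return;
      next(back) = free_list;  // attach the end of our list to the free list
      free_list = front;
    }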

    @@ -449,6 +344,8 @@

    Code

    @@ -456,70 +353,76 @@

    Code

    1. Alex: I’m talking to an apparently non-existent Lisp community -because MIT is just a Python school now.
    2. +because MIT is just a Python school now +(see “Programming by poking: why MIT stopped teaching SICP”).
    3. -

      Alex call’s these “lists” without much explanation. -In Lisp all lists are built out of these pairs. -The car (first element) is the value of the list at this point. -The cdr (second element) points to another pair, or nil. -nil terminations the list.

      +

      Alex calls these “lists” without much explanation. +In Lisp all lists are built out of pairs. +In each pair, the first element (called the car) is the value of the list at this point. +The second element (called the cdr) is a pointer to another pair, or nil. +nil terminates the list.

      -

      For example the list (1 2 3) is represented by

      +

For example, +if we write a pair as (car . cdr) with . delimiting the car and cdr, +the list 1 2 3 can be constructed from three pairs:

      -
      (1, -)---> (2, -)--->(3, -)-->nil
      +
      (1 . (2 . (3 . nil)))
       

      See chapter 2.2 of “Structure and Interpretation of Computer Programs” -to learn more.

    4. +for a thorough introduction to Lisp lists.

      + +

      car and cdr are commonly called head and tail in other functional languages. +Their names are historical artifacts of the hardware that early Lisp implementations used +(see “CAR and CDR Wikipedia page” or “Lisp 1.5 Programmer’s Manual”).

    5. -I cannot find any reference to this person or materials related -to this discussion. Please contact me if you know. -In this C++ Blog -Microsoft appears to be taking the position in agreement with Alex, -that locking on data structures is not a sufficient -approach to concurrent programming.
    6. -
    7. All kinds of problems can arise from two threads modifying the same resource. -The basic problem is that you can no longer reason about control flow in your code. +When code executes concurrently, it’s much more difficult to reason about control flow. One line does not immediately follow the other, -so things can be overwritten or messed up in betwween statements. +so things can be overwritten or messed up in between statements. Another problem is called a race condition. -This is when a piece of code relies on one thread doing a task before another,
    8. -
    9. -

      Lock’s (often called mutex in programming) +This is when a piece of code relies on one thread doing a task before another.

    10. +
    11. +

      Locks (often called mutexes in programming) are a mechanism for controlling access to a shared resource. -If you want to avoid threads messing up each others work, -you ensure that only one thread is allowed to modify the resource at a time. -Designing such a mechanism is actually fairly difficult. -(See “The Art of Multiprocesser Programming” By Herlihy and Shavit.)

      - -

      They tend to be slow because they either require waiting in a loop “spin lock” -or communicating with the kernel scheduler.

      - -

      Many programming projects in the late 90s and early 2000s (especially Java and C#) -decided -that the way to support multithreading programming was to ensure exclusive -access to most every resources, as if one would write code -using threads all over the place. +To prevent multiple threads from running over each other, +a lock ensures that only one thread can access or modify +a shared resource at a time. +Designing such a mechanism well is actually fairly difficult. +(See “The Art of Multiprocesser Programming” by Herlihy and Shavit.) +Locks tend to be slow because they pause threads until they are safe to proceed. +In addition they usually communicate with the kernel.

      + +

      Many programming frameworks in the late 90s and early 2000s (especially Java and C#) +decided that the way to support multithreaded programming was to protect +every resource with locks, +as if programs should share class instances +across threads haphazardly. This trend is reflected in Alex’s story.

      -

      The error prone nature of concurrency and parallelism has led to more disciplined design -and tools. -Often portions portions of the program are explicity dedicated to a certain thread -and communication is carefully controlled. -Another trend is to use -threads in a functional manner, invoking them to do a bunch of work, without external -state, and returning a single result.

      - -

      Based on Alex’s comments we can assume he -would support using multiple processes instead of threads. -Processes offer memory protection by default, and then allow -you to expose dangerous shared portions for communication.

    12. +

      Since then, the error prone nature of concurrency and parallelism +has encouraged more disciplined design and tools. +One approach is to organize the program architecture around a few specific threads running +for the duration of the program, with carefully controlled communication +protocols. +Another is to spawn threads only to compute pure functions, +which do not have shared resource problems.

      + +

Based on Alex’s comments we can guess that +he would prefer processes to threads. +Processes offer memory protection by default, with all the danger +centralized in small shared portions. +(See chapter 7 of “The Art of UNIX Programming”.)

      +
    13. +Although malloc may lock, according to +this article, +STL containers on Microsoft platforms do not attempt to +ensure thread safety with locks.
    14. A significant difference between Alex’s lists and those -in Lisp is that they are homogenous, +in Lisp is that they are homogeneous, they can only store one type of value. -In Lisp, hetrogenous lists are everywhere, +In Lisp, heterogeneous lists are everywhere, especially nested lists, which are what allow code to be written in a list format.

      @@ -537,6 +440,22 @@

      Code

      The complexity of allocating and managing memory for such structures was one of the motivations for inventing garbage collection.

    15. +
16. +rplaca and rplacd are unfriendly abbreviations of +“replace car” and “replace cdr” (see “Lisp 1.5 Programmer’s Manual”). +They are low-level functions for manipulating pairs in lists. +In Scheme they correspond to set-car! and set-cdr!. +In Common Lisp one typically uses the higher-level macro setf for the same purpose.
    17. +
    18. +The assignment in this code is the same as (setf (cdr x) free-list) in Common Lisp, +or (set-cdr! x free-list) in Scheme.
    19. +
    20. +Since we implement pop_front, you might also expect pop_back. +A little thought will reveal there is no constant +time implementation for singly linked lists. +The queue has a reference to the last node in the queue, +but removing it would require modification +of the preceding node.
    diff --git a/08_lisp.md b/08_lisp.md index defa5a6..04cdaee 100644 --- a/08_lisp.md +++ b/08_lisp.md @@ -3,12 +3,11 @@ ## Lists in lisp and Scheme -A long time ago there was a programming language called [Lisp][lisp][^community] +A long time ago there was a programming language called [Lisp][lisp] or for you younger folks [Scheme][scheme]. -Scheme might have been wrong, but it was great. -The whole language centers around very simple [linked lists][linked] - which are based on three fundamental -operations[^sicp]: +Scheme might have been wrong, but it was great[^community]. +The whole language centers around very simple [linked lists][linked-list] +which are based on three fundamental operations[^sicp]: 1. [`cons`](http://www.lispworks.com/documentation/lw50/CLHS/Body/f_cons.htm): create a pair. 2. [`car`](http://clhs.lisp.se/Body/f_car_c.htm#car): get first element of pair. @@ -22,8 +21,7 @@ We don't want garbage collection for all the algorithms we want to use. So we are going to add a 4th operation: -4. `free`: manually release/free a pair. - + 4. `free`: manually release/free a pair. What we want to do is muck around with lists. Meaning you can insert items in the middle, change pointers, connect this and that. @@ -34,30 +32,36 @@ But, we're going to build it so it's blindingly fast. How are we going to do that? You want to avoid memory fragmentation. If you have lists -with nodes spread all over memory, every time -you access one, it is a cache mich. -Modern computers caches do not really help if you do long jumps. +with nodes spread all over memory, every time +you access one, it is a cache miss. +Modern computer caches do not really help if you do long jumps. We have lots of nodes, but we want them to live in a little buffer even if we keep generating them back and forth. If they reside in a small space we will never get a cache miss. -[^sicp]: Alex call's these "lists" without much explanation. - In Lisp all lists are built out of these pairs. - The `car` (first element) is the value of the list at this point. - The `cdr` (second element) points to another pair, or `nil`. - `nil` terminations the list. - - For example the list `(1 2 3)` is represented by +[^sicp]: Alex calls these "lists" without much explanation. + In Lisp all lists are built out of pairs. + In each pair, the first element (called the `car`) is the value of the list at this point. + The second element (called the `cdr`) is a pointer to another pair, or `nil`. + `nil` terminates the list. - (1, -)---> (2, -)--->(3, -)-->nil + For example, + if we write a pair as `(car . cdr)` with `.` deliminating the `car` and `cdr`, + the list `1 2 3` can be constructed from three pairs: + (1 . (2 . (3 . nil))) See [chapter 2.2][sicp] of "Structure and Interpretation of Computer Programs" - to learn more. + for a thorough introduction to Lisp lists. + + `car` and `cdr` are commonly called `head` and `tail` in other functional languages. + Their names are historical artifacts of the hardware that early Lisp implementations used + (see ["CAR and CDR Wikipedia page"][car-and-cdr] or "Lisp 1.5 Programmer's Manual"). [^community]: Alex: I'm talking to an apparently non-existent Lisp community - because MIT is just a [Python school now](http://lambda-the-ultimate.org/node/5335). + because MIT is just a Python school now + (see ["Programming by poking: why MIT stopped teaching SICP"][programming-by-poking]). 
[gc]: https://en.wikipedia.org/wiki/Garbage_collection_(computer_science) @@ -65,7 +69,9 @@ If they reside in a small space we will never get a cache miss. [sicp]: https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-15.html#%_sec_2.2 [lisp]: https://en.wikipedia.org/wiki/Lisp_(programming_language) [scheme]: https://en.wikipedia.org/wiki/Scheme_(programming_language) -[linked]: https://en.wikipedia.org/wiki/Linked_list +[linked-list]: https://en.wikipedia.org/wiki/Linked_list +[car-and-cdr]: https://en.wikipedia.org/wiki/CAR_and_CDR +[programming-by-poking]: http://lambda-the-ultimate.org/node/5335 ### Why is malloc so slow? @@ -77,7 +83,7 @@ based data structures. So for any data structure of nodes, such as list, I would keep a pool of nodes myself and manage them in a quick way. -A few people, such as Bill Plauga[^unknown] at Microsoft +A few people, such as [Bill Plauger][pj-plauger] at Microsoft and others at [GNU][gnu] who followed their example, said that if they have a common pool and they just do pointer movement then if you have multiple threads you could have problems[^race]. @@ -88,58 +94,58 @@ Then we're going to throw Alex's pool management away and we're going to do full malloc." Now malloc is function call with a lock, so it's a very heavy operation. -Because of this decision, all our lists are going to be thread safe. +Because of this decision, all our lists are going to be thread safe[^microsoft-thread-safe]. People like us, who do not use threads (you don't use threads right?) pay for them. They violated a fundamental principle which Bjarne insists on. *People should not pay for things they do not use*. Everybody pays for the ability of multiple threads to do list allocations out of the -same pool, which actually nobody does but everybody pays. +same pool. Which actually nobody does, but everybody pays. -[^unknown]: I cannot find any reference to this person or materials related - to this discussion. Please contact me if you know. - In this [C++ Blog](https://devblogs.microsoft.com/cppblog/concurrent-containers/) - Microsoft appears to be taking the position in agreement with Alex, - that locking on data structures is not a sufficient - approach to concurrent programming. +[pj-plauger]: https://en.wikipedia.org/wiki/P._J._Plauger +[^microsoft-thread-safe]: Although `malloc` may lock, according to + [this article](https://devblogs.microsoft.com/cppblog/concurrent-containers/), + STL containers on Microsoft platforms do not attempt to + ensure thread safety with locks. + [^race]: All kinds of problems can arise from two threads modifying the same resource. - The basic problem is that you can no longer reason about control flow in your code. + When code executes concurrently, it's much more difficult to reason about control flow. One line does not immediately follow the other, - so things can be overwritten or messed up in betwween statements. + so things can be overwritten or messed up in between statements. Another problem is called a [race condition][race]. - This is when a piece of code relies on one thread doing a task before another, + This is when a piece of code relies on one thread doing a task before another. -[^lock]: [Lock's][lock] (often called mutex in programming) +[^lock]: [Locks][lock] (often called mutexes in programming) are a mechanism for controlling access to a shared resource. - If you want to avoid threads messing up each others work, - you ensure that only one thread is allowed to modify the resource at a time. 
- Designing such a mechanism is actually fairly difficult. - (See "The Art of Multiprocesser Programming" By Herlihy and Shavit.) - - They tend to be slow because they either require waiting in a loop "spin lock" - or communicating with the kernel scheduler. - - Many programming projects in the late 90s and early 2000s (especially Java and C#) - decided - that the way to support multithreading programming was to ensure exclusive - access to most every resources, as if one would write code - using threads all over the place. + To prevent multiple threads from running over each other, + a lock ensures that only one thread can access or modify + a shared resource at a time. + Designing such a mechanism well is actually fairly difficult. + (See "The Art of Multiprocesser Programming" by Herlihy and Shavit.) + Locks tend to be slow because they pause threads until they are safe to proceed. + In addition they usually communicate with the kernel. + + Many programming frameworks in the late 90s and early 2000s (especially Java and C#) + decided that the way to support multithreaded programming was to protect + every resource with locks, + as if programs should share class instances + across threads haphazardly. This trend is reflected in Alex's story. - The error prone nature of concurrency and parallelism has led to more disciplined design - and tools. - Often portions portions of the program are explicity dedicated to a certain thread - and communication is carefully controlled. - Another trend is to use - threads in a functional manner, invoking them to do a bunch of work, without external - state, and returning a single result. - - Based on Alex's comments we can assume he - would support using multiple processes instead of threads. - Processes offer memory protection by default, and then allow - you to expose dangerous shared portions for communication. - + Since then, the error prone nature of concurrency and parallelism + has encouraged more disciplined design and tools. + One approach is to organize the program architecture around a few specific threads running + for the duration of the program, with carefully controlled communication + protocols. + Another is to spawn threads only to compute pure functions, + which do not have shared resource problems. + + Based on Alex's comments we can guess that + he would prefer processes to threads. + Processes offer memory protection by default, with all the danger + centralized in small shared portions. + (See chapter 7 of "The Art of UNIX Programming") [gnu]: https://www.gnu.org/ [lock]: https://en.wikipedia.org/wiki/Lock_(computer_science) @@ -151,12 +157,12 @@ A list pool is an object with many outstanding lists inside. Internally we will use one vector to implement many, many, lists. These lists are not containers. -A container guarantees that when a container is gone, the values are gone too. +A container guarantees that when the container is gone, the values are gone too. For these lists there is no guarantee like that. For example, you could split this list into two by setting a `cdr`. There is no ownership and this is why I recommend not viewing them as containers. -STL containers are wonderful, when you want them, but that's not the case. +STL containers are wonderful when you want them, but that's not the case here. We're trying to get as close to Lisp as we can without building garbage collection[^difference]. If you want to build garbage collection you can extend @@ -165,9 +171,9 @@ is overrated. 
[^difference]: A significant difference between Alex's lists and those - in Lisp is that they are homogenous, + in Lisp is that they are homogeneous, they can only store one type of value. - In Lisp, hetrogenous lists are everywhere, + In Lisp, heterogeneous lists are everywhere, especially nested lists, which are what allow code to be written in a list format. @@ -186,8 +192,7 @@ is overrated. for inventing garbage collection. -Implement it as a class, -we will have two types. +We will implement `list_pool` as a class, with two types as template arguments. `T` will be the values we want to store, and `N` will be an index type. @@ -196,29 +201,50 @@ and `N` will be an index type. // T is semi-regular. // N is integral class list_pool { + typedef N list_type; + + struct node_t { + T value; + N next; + }; + + std::vector pool; + list_type free_list; + // ... }; -Now we are going to implement `cons`, `car`, `cdr`, and `free`, but -we need appropriate names for a younger generation. +What should `N` be? Why not `size_t`? +Because it's 64 bits. +For our application we could probably +use `uint16_t` so our whole node fits in 32 bits. +But, we should define a default. + + typename N = size_t; + +Now we are going to implement `cons`, `car`, `cdr`, and `free` as member +functions of `list_pool`, but we need appropriate names for a younger generation. -### Car +### Value (car) We will rename `car` to `value`. -Actually, it won't just be `car`, it will -also act as [`rplaca`][set-car] (set car). +Note that because we can return `value` by reference, +it can be both read and modified. +So, it won't just be `car`, it will +also act as [`rplaca`][rplaca][^rplaca-explanation]. T& value(list_type x) { return node(x).value; } - const T& value(list_type x) cons { - reutrn node(x).value; + const T& value(list_type x) const { + return node(x).value; } -### Cdr +### Next (cdr) -Similarly, we want `cdr` and [`rplacd`][set-car]. +Let's rename `cdr` to `next`. +Because of read and write it also acts as [`rplacd`][rplacd]. list_type& next(list_type x) { return node(x).next; @@ -228,11 +254,13 @@ Similarly, we want `cdr` and [`rplacd`][set-car]. return node(x).next; } +[rplacd]: http://clhs.lisp.se/Body/f_rplaca.htm + ### Free Now let's write `free`. -We can make it somewhat more useful by returning something other than void. -Return the next, otherwise the user will have to save it before freeing. +The pool maintains a list of nodes which are available for reuse. +This operation appends a node to the head of this list[^free-in-lisp]. list_type free(list_type x) { list_type cdr = next(x); @@ -241,15 +269,15 @@ Return the next, otherwise the user will have to save it before freeing. return cdr; } -This is the same as `(setf (cdr x) free-list)` in Lisp or `(set-cdr! x free-list)` -in Scheme. +We make it somewhat more useful by returning `next(x)` instead of void. +If it was not returned, the user would have to save it before freeing. -### Cons +### Allocate (cons) Now we will write `cons`, it takes two arguments. Where do nodes come from? The free list, if it has room, -otherwise we made a new node from the pool. +otherwise we make a new node from the pool. list_type allocate(const T& val, list_type tail) { list_type new_list = free_list; @@ -266,7 +294,7 @@ otherwise we made a new node from the pool. return new_list; } -So we need to write the public function `is_empty` and the private one `new_node`. +So we need to write the public function `is_empty`. 
bool is_empty(list_type x) const { return x == empty(); @@ -275,7 +303,7 @@ So we need to write the public function `is_empty` and the private one `new_node Dual to this function, is one which gives you the `nil` or empty list. list_type empty() { - return listp_type(0); + return list_type(0); } You might think, what about the `0`th item in the pool? @@ -283,23 +311,14 @@ We will just index everything at `1`, so we don't lose the first item. If you use `-1` then our index type must be signed. - typedef N list_type; - list_pool() { free_list = empty(); } -Let's write the class and private stuff now: - - struct node_t { - T value; - N next; - }; - - std::vector pool. +Now we write a few private node functions including `new_node`. node_t& node(list_type x) { - reutrn pool[x - 1]; + return pool[x - 1]; } const node_t& node(list_type x) const { @@ -316,20 +335,11 @@ or all of them to be non-`const`. Typically `const` is just for handing someone something to read. -What should `N` be. Why not `size_t`? -Because it's 64 bits. -For our application we could probably -use `uint16` so our whole node -fits in 32 bits. -But, we should define a default. - - typename N = size_t - ### Free list helper There is a simple rule to distinguish when you should write a method/member function -and what to just make an outside function. +and when to just make an outside function (free function). Implement the simplest possible thing. If you can do it outside, do it. @@ -343,6 +353,24 @@ not just a node. while (!pool.is_empty(x)) x = pool.free(x); } +**Exercise:** Before moving on, get familiar with these operations. + Create a simple list inside a pool and print it by iterating through its contents + (solved in `test_list_pool.cpp` at the end of the chapter). + + +[^rplaca-explanation]: `rplaca` and `rplacd` are unfriendly abbrevations of + "replace car" and "replace cdr" (see "Lisp 1.5 Programmer's Manual"). + They are low-level functions for manipulating pairs in lists. + In Scheme they correspond to `set-car!` and `set-cdr!`. + In Common Lisp one typically uses the higher-level macro [`setf`][setf] for the same purpose. + +[^free-in-lisp]: The assignment in this code is the same as `(setf (cdr x) free-list)` in Common Lisp, + or `(set-cdr! x free-list)` in Scheme. + +[rplaca]: http://clhs.lisp.se/Body/f_rplaca.htm +[setf]: http://www.lispworks.com/documentation/lw50/CLHS/Body/m_setf_.htm + + ## List queue We can use our list to implement a queue structure. @@ -357,13 +385,6 @@ and construct them: bool empty(const pair_type& p) { return is_end(p.first); } pair_type empty_queue() { return pair_type(end(), end()); } -You can remove an element from the front of the queue: - - pair_type pop_front(const pair_type& p) { - if (empty(p)) return p; - return pair_type(next(p.first), p.second); - } - You can add an element to the front, or the back of the queue: pair_type push_front(const pair_type& p, const T& value) { @@ -379,15 +400,29 @@ You can add an element to the front, or the back of the queue: return pair_type(p.first, new_node); } +You can remove an element from the front of the queue[^no-pop-back]: + + pair_type pop_front(const pair_type& p) { + if (empty(p)) return p; + return pair_type(next(p.first), p.second); + } + Now we can also free lists in constant time, simply by attaching the end of our list to the free list. 
void free(const pair_type& p) { free(p.first, p.second); } -[set-car]: http://clhs.lisp.se/Body/f_rplaca.htm -## Code +[^no-pop-back]: Since we implement `pop_front`, you might also expect `pop_back`. + A little thought will reveal there is no constant + time implementation for singly linked lists. + The queue has a reference to the last node in the queue, + but removing it would require modification + of the preceding node. -- [list_pool.h](code/list_pool.h) +## Code +- [list_pool.h](code/list_pool.h) +- [list_pool_iterator.h](code/list_pool_iterator.h) (included by `list_pool.h`, but not discussed until Chapter 9) +- [test_list_pool.cpp](code/test_list_pool.cpp) diff --git a/09_iterators.html b/09_iterators.html index f22cd19..e5aed0c 100644 --- a/09_iterators.html +++ b/09_iterators.html @@ -2,116 +2,9 @@ -# 9. Iterators - + +9. Iterators + @@ -140,7 +33,7 @@

    History of iterators

    and for all.

    Let me tell you a little about them. -Anybody who programs in C++ is forced to use vector +Anybody who programs in C++ is forced to use std::vector and those have iterators. But, there are lots of people who do not quite understand what they are, partially because iterators are called iterators. @@ -156,13 +49,9 @@

    History of iterators

    It’s a mythical language. It was never implemented. But, many people at CMU got tenure because of it. It has some interesting ideas, including the idea of a generator. -For those of you who know Python, it is like an iterator in Python. -Barabra Liskov said, wouldn’t it be nice to write:

    - -
    for x in thing
    -
    - -

    and iterators allow you to do that. +For those of you who know Python, it is like an iterator in Python1. +Barbara Liskov said, “wouldn’t it be nice to write something like: for x in thing”. +Iterators allow you to do that2. It is like a generator in Alphard. It is a procedure which returns multiple values, one by one. It’s a procedural abstraction. @@ -172,113 +61,103 @@

    History of iterators

    At the same time, I was working on how to do algorithms and I introduced the notion of position. -A better name is coordinate, the name which Paul and I are use -in our book “Elements of Programming”1. +A better name is coordinate, the name which Paul and I use +in our book “Elements of Programming”. A coordinate is some way of indicating where in the data structure you are. -It is not a control structure it’s just the pointer into a data structure, +It is not a control structure, it’s just the pointer into a data structure, or generalized notion of a coordinate. -It is something which allows me to navigate through to the data structure in a natural way the -data structure navigates.

    +It is something which allows me to navigate through the data structure in a +natural way3.

    Eventually I started talking to C++ guys and showed them coordinates and they said, “we call it iterator”. Of course they didn’t call what I had an iterator. They had some C++ code where they were trying to do CLU like iterators, heavy state procedural -things instead of lightweight pointer. +things instead of lightweight pointers. So, I chickened out. I said, “Yes, that’s exactly what you guys have”. I thought it was much better to win with the wrong name than lose with the right name, so the name is stuck. -It’s in the standard, but again the concept which designates is not a -concept of iterator in CLU or iterator in Python which is not iterator in Java. -An iterator is a generalization of a coordinate in a data structure. -It’s a lightweight thing, it doesn’t do anything. -It just points to things.

    - -

    This argument is active again. -There are boost guys -who say, “Iterator is all wrong. Let us, go back +It’s in the standard. But again, the concept which it designates is not a +concept of iterator in CLU or iterator in Python or iterator in Java. +Our iterator is a generalization of a coordinate in a data structure. +It’s a lightweight thing. It doesn’t do anything, +it just points to something.

    + +

    There are these arguments which I hear from people like the +Boost4 guys, who say “Iterators are all wrong. Let us go back and do ranges.” -They’re reinventing Barbara Liskov iterators. -Just for the record, when I introduced my iterators, -I was very familiar -with the iterators in CLU. - Moreover I was even very familiar with Barbara herself -and with Alan Snyder who designed iterators. +Guess what? They’re reinventing Barbara Liskov’s iterators. +Just for the record, when I introduced my iterators +I was very familiar with the iterators in CLU. +Moreover I was even very familiar with Barbara herself +and with Alan Snyder who co-designed the iterators in CLU. I didn’t do what they did because I wanted to do something else. It wasn’t out of ignorance. Maybe I was stupid, but I wasn’t ignorant.

    - -

    List pool iterators

    + +

    Affiliated types for iterators

    We always need to distinguish between how we type code in C++ and what the notion is behind it. A key part of iterators is affiliated types. Iterators point to values and you want to know what those values are. Types don’t work on their own. -They come in clusters, or connected families. -If you have int* there is an affiliated type int -which are related. -We want to be able to obtain int from int*. -Even in Python which has duck typing.

    +They come in clusters, or connected families of types. +If you have a type int*, there is an affiliated type int. +The two types are related. It would be terribly nice if we had a +way to obtain int from int*. That is, if somebody +gives me a pointer type, I want a way to find out what type +it points to.

    + +

    To do this we need this notion of type functions which accept one type +and return a different type. +This problem of needing to obtain affiliated types is not specific to C and C++. +It appears in Java and Python as well. +In spite of Python’s duck typing there’s still a connection +between types, even if they are duck types.

    + +

    So we need this notion of type functions, +but C doesn’t let us do this, +and neither does C++. +Instead of type functions we are going to solve +this problem for iterators by using typedefs.
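To make “type function” concrete before we fall back on typedefs, here is a small illustration of the idea (my sketch, not code from the lecture): a struct template whose nested typedef computes the pointed-to type from a pointer type.

    // A type function: give it int*, it answers int.
    template <typename T>
    struct pointee;            // no answer for arbitrary types

    template <typename T>
    struct pointee<T*> {       // specialized for any pointer type T*
      typedef T type;
    };

    // pointee<int*>::type is int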

    For an iterator, there are 5 types that we always need to define. -3 are primary, 2 are secondary. +Three of them are primary, two are secondary. Let’s start with the primary:

    1. value_type: the type of the value it points to.

    2. difference_type: Difference between pointers (ptrdiff_t in C). Unlike C we might need a different type. - The length between elements in a range, depends on the range type. + The length between elements in a range depends on the range type. It is an integral type large enough to encode any valid range of a given iterator.

    3. iterator_category: - Once again this is a general notion, not just C++. - There are ForwardIterators, RandomAccessIterators, - InputIterators, etc. - They are different theory of iterators.

      - -

      In C++ (without concepts) we use tag types. - Every iterator uses a tag type to signify what theory - it supports. - The tag lets you do compile type dispatch.

      - -

      What category is this iterator? - In the list pool, our iterator is a ForwardIterator. - Because, there is no way in a singly linked list - to go backward.

    4. + Once again we need to distinguish between how we type this in C++ and what + the notion behind it is. + The notion here is that there are different categories, or theories, + of iterators: ForwardIterators, RandomAccessIterators, + InputIterators, etc…

      + +

      In C++ (without concepts) we use tag types to designate the iterator categories. + Every iterator uses a tag type to denote which theory it supports. + The tag lets you do compile time dispatch5.
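To see what the tag buys us, here is a common sketch of tag dispatch (my example; the footnote on std::distance below describes the same technique): two overloads, one linear and one constant time, selected at compile time by the category tag.

    #include <iterator>

    // Linear: step and count (enough for ForwardIterators).
    template <typename I>
    typename std::iterator_traits<I>::difference_type
    distance_impl(I first, I last, std::forward_iterator_tag) {
      typename std::iterator_traits<I>::difference_type n(0);
      while (first != last) { ++first; ++n; }
      return n;
    }

    // Constant time: RandomAccessIterators can subtract.
    template <typename I>
    typename std::iterator_traits<I>::difference_type
    distance_impl(I first, I last, std::random_access_iterator_tag) {
      return last - first;
    }

    template <typename I>
    typename std::iterator_traits<I>::difference_type
    my_distance(I first, I last) {
      // Constructing the tag selects the right overload at compile time.
      return distance_impl(first, last,
          typename std::iterator_traits<I>::iterator_category());
    }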

    -

    Let’s define these types in the list_pool -from last time.

    + +

    Historical artifacts

    -
    #include <iterator>
    -
    -// class list_pool {
    -// ...
    -
    -struct iterator {
    -  typedef list_pool::value_type value_type;
    -  typedef list_pool::list_type difference_type;
    -  typedef std::forward_iterator_tag iterator_category;
    -};
    -
    -// };
    -
    - - -

    Iterator reference types

    - -

    The 4th and 5th types to be defined are required only for historical reasons. +

    The fourth and fifth types to be defined are required only for historical reasons. Microsoft said they were going to vote against STL unless it accommodated multiple memory models. At the time, they had tiny pointers, huge pointers and -far pointers. +far pointers. They wanted STL to somehow work with all of them. I had to figure out how they work. @@ -286,15 +165,16 @@

    Iterator reference types

    weird. They are both 32 bits. But, with far pointer if you add -one to it, and the first low two bytes overflow, it rotates without propagation. -With a huge, it does propagated to the 16th bit.

    +one to it, and the two lowest bytes overflow, +they wrap without propagation to the upper bytes. +With a huge pointer, the carry is propagated to the upper bytes, but by adding +8 to them6.

    -

    So, they -they demanded that I change the whole architecture to accommodate them. +

    So they demanded that I change the whole architecture to accommodate them. Guess how they voted? No. Now we’re stuck for the next hundred years with stuff which was included to placate people that - couldn’t have been placated.

    +couldn’t have been placated.

    So what does an iterator return when you dereference it? Normally a reference. It’s an lvalue @@ -303,20 +183,35 @@

    Iterator reference types

    type should be. So now we need to provide it.

    -
      -
    1. reference: the type of a reference to the value.
    2. -
    3. pointer: the type of a pointer to the value.
    4. -
    +

     4. reference: the type of a reference to the value.
    + 5. pointer: the type of a pointer to the value.

    +

    It’s not particularly harmful, but it obfuscates things +and it provides “language experts” with steady employment.

    -

    Here is our implementation:

    + +

    List pool iterators

    -
    typedef value_type& reference;
    -typedef value_type* pointer;
    -
    +

    Let’s define these types in the list_pool iterator. +What category is the iterator for list_pool from last chapter? +We need ForwardIterator, because there is no way in a singly linked list +to go backwards.

    -

    It’s not particularly harmful, bu it obfuscates things -and it provides “language experts” with steady employment.

    +
    #include <iterator>
    +
    +// class list_pool {
    +// ...
    +
    +     struct iterator {
    +       typedef list_pool::value_type value_type;
    +       typedef list_pool::list_type difference_type;
    +       typedef std::forward_iterator_tag iterator_category;
    +       typedef value_type& reference;
    +       typedef value_type* pointer;
    +     };
    +
    +// };
    +

    Constructors

    @@ -324,7 +219,7 @@

    Constructors

    Let’s write constructors for our iterator:

    iterator() {}
    -iterator(list_pool& p, list_pool.::list_type node) :
    +iterator(list_pool& p, list_pool::list_type node) :
         pool(&p), node(node) {}
     
    @@ -366,10 +261,10 @@

    Dereference

    could be done differently. If you work in C, you write C.

    - -

    Preincrement, postincrement

    + +

    Pre-increment, post-increment

    -

    When you increment the iterator should +

    When you increment, the iterator should move to the next node.

    iterator& operator++() {
    @@ -387,7 +282,7 @@ 

    Preincrement, postincrement

    int here doesn’t do anything, it’s just to distinguish pre and post. -Preincrement could be automatically generated +Post-increment could be automatically generated and it would be criminal to do anything else.
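The post-increment operator itself is elided above; the canonical pattern, which is exactly what automatic generation would produce, is a sketch like this:

    iterator operator++(int) {
      iterator tmp(*this);  // remember the current position
      ++*this;              // reuse pre-increment
      return tmp;           // hand back the old position
    }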

    @@ -404,7 +299,7 @@

    Equality

    friend
     bool operator==(const iterator& x, const iterator& y) {
    -  // assert(x.pool === y.pool);
    +  // assert(x.pool == y.pool);
       return x.node == y.node;
     }
     
    @@ -418,7 +313,11 @@ 

    Equality

    for equality, as it violates mathematical tradition. But, oh well. When you took Algebra in grade school, -they use = and they use x and y and I think it’s good.

    +they used = and they used x and y and I think it’s good.

    + +

    Exercise: Experiment with list pool iterators by using + a standard library algorithm on them, such as find or copy + (see test_list_pool_iterator.cpp at the end of the chapter).
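A minimal sketch of such an experiment (assuming the iterator, its constructors, and an operator!= defined symmetrically to operator== are all available, and using an iterator positioned at empty() as the end):

    #include <algorithm>
    #include <iostream>
    #include <iterator>

    list_pool<int, size_t> pool;
    list_pool<int, size_t>::list_type l = pool.empty();
    l = pool.allocate(3, l);
    l = pool.allocate(2, l);
    l = pool.allocate(1, l);  // the list is now 1 2 3

    list_pool<int, size_t>::iterator first(pool, l);
    list_pool<int, size_t>::iterator last(pool, pool.empty());

    // Print the whole list.
    std::copy(first, last, std::ostream_iterator<int>(std::cout, " "));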

    Thoughts about iterator design

    @@ -428,10 +327,10 @@

    Should we add safety guards?

    Notice I sometimes write assertions in comments:

    -
    // assert(x.pool === y.pool);
    +
    // assert(x.pool == y.pool);
     
    -

    I don’t write it because it takes too long to check2. +

    I don’t use a real assert because it takes too long to check7. There is nobody who should be comparing iterators from separate pools. If he does, he deserves what he gets. But, wouldn’t it be good to guarantee safety? @@ -446,30 +345,30 @@

    Should we add safety guards?

    What about compiling debug mode? It’s mildly useful. The only truly useful thing is to decompose -your program into clear subroutines and unit +your program into clear subroutines and clear units which you understand. That’s the only thing I know that works.

    The reason to program in C++ is to have access to the machine. -We want pointers. +You want to have these unsafe things called pointers. If you want a language which hides the machine, -then use it. +then use it. It has its advantages. Python is good for writing scripts. But don’t write your operating system in Python. Don’t write your search engine. -BASIC and Cobol were wonderful for what they are. -I wouldn’t use Cobol to write an operating system.

    +BASIC and COBOL were wonderful for what they are. +I wouldn’t use COBOL to write an operating system.

    Why are forward iterators not comparable?

    What is the basic example of a ForwardIterator? A singly linked list. -Forward iterators do not have <. +Forward iterators do not have ‘less-than’ defined on them. Is that a good idea? You could do it for linked lists, but it’s very expensive -and not even guaranteed to terminate3. -But you’re assuming “If I am <, then I am before”. +and not even guaranteed to terminate8. +But you’re assuming “If I am less-than, then I am before”. This is only one interpretation. Let me explain.

    @@ -484,7 +383,7 @@

    Why are forward iterators not comparable?

    For example, suppose I want to see if an iterator is in the structure, such as inclusion in a set. The only effective way is comparison. -I was torn because I knew I could not include it, +I was torn because I knew I could not include it, because most people would attempt to write code in STL the old way they learned which is:

    @@ -511,46 +410,95 @@

    Everything on a computer is totally ordered

    It was an oversight on the part of C and still C++ again. The compiler should generate equality, and it could generate inequality using -lexographical ordering.

    +lexicographical ordering.

    So if you want to use iterators -on set you can define a custom comparator. -One, which compares the addresses of the elements to which iterator points. +on set, you can define a custom comparator. +One which compares the addresses of the elements to which the iterator points. It has nothing to do with before and after, but it establishes a total ordering. A total ordering does not have to be topologically induced by the traversal.

    -

    Exercise: extend the iterator to linked iterator - so we can assign the next on a node. - Specifically we want to be able to modify the successor of - an iterator.

    +

    Exercise: Extend the list pool iterator + with the ability to modify the next of the node it points to + (this is discussed and solved in chapter 12).

    Code


    1. -Alex recommends chapter 7 of “Elements of Programming” - on coordinate structures.
    2. +

      Alex almost certainly means Python generators, not iterators, but I will describe both.

      + +

An iterator in Python is any object which implements a method called __next__(). +Unlike C++ iterators, the __next__() method always returns another element of the sequence, not an iterator, +so they do not resemble pointers or coordinates, nor are they comparable. +They are most similar to InputIterator in that previous values in the sequence become inaccessible after advancing. +The only thing special about the __next__() method is its compatibility with language constructs like for loops.

      + +

      A generator in Python is a kind of iterator that is typically implemented +as a function with some helpful syntax additions. +In a generator function, the yield keyword is used to return the next value in the sequence. +If an additional value is requested after yielding, +the function will resume at the point of the previous call to yield. +This makes writing complex sequences more natural, as the control flow operates like other code. +For example, the following returns square numbers:

      + +
      def square_nums(count):
      +    k = 0
      +    while k < count:
      +        yield k * k
      +        k += 1
      +
      + +

      It can be used in a for loop:

      + +
      for x in square_nums(10):
      +    print(x)
      +# 0 1 4 ...
      +
    3. +See this brief description of CLU iterators.
    4. +
    5. +See chapter 7 of “Elements of Programming” on coordinate structures. +An interesting discussion on the general idea of “coordinatisation” +is found in chapter 1 of “Basic Notions of Algebra” by Shafarevich.
    6. +
    7. +Boost is a popular collection of C++ libraries, covering a wide range of uses, + generally accepted as the next tool to reach for beyond the standard library. + Many standard library features, such as smart pointers, were initially developed in Boost. + Alex speaks positively of some parts (see his foreword for “The Boost Graph Library”), but others he is more critical of.
    8. +
    9. +Some algorithms can be implemented more efficiently for certain +iterator categories. For example std::distance can be +implemented as a constant time algorithm for RandomAccessIterators but +only a linear time algorithm for other iterator categories. The +iterator_category tag allows the appropriate algorithm to be selected at +compile time. This technique is known as tag dispatch.
    10. +
    11. +See “A look back at memory models in 16-bit MS-DOS” +for a brief overview of these various pointer types. +The more general concept behind them is memory segmentation.
    12. +
    13. I think modern compilers have fixed this so you can write checks with more confidence that they won’t affect release builds. Specifically the standard has mandated some rules about when -assert is enabled.
    14. -
    15. -The Swift language and standard library -actually implement very similar concepts and containers -as C++. In their equivalent to ForwardIterator -they actually did require comparable, -and bounds check with it, -making it very difficult to write data structures like linked lists -and probably makes Alex’s pointer comparison trick impossible.
    16. +assert is enabled. +
17. +The Swift standard library actually takes a lot of inspiration from C++ and Alex’s work. +For example, the protocol Collection has an Index type which is equivalent to ForwardIterator, +with some differences. +One of these is that the Index type must be comparable, +in order to support safety features like bounds checking. +This restriction makes it difficult to write data structures like linked lists +and probably makes Alex’s pointer comparison trick impossible.
    diff --git a/09_iterators.md b/09_iterators.md index f169255..b9dc194 100644 --- a/09_iterators.md +++ b/09_iterators.md @@ -1,4 +1,5 @@ -# 9. Iterators +9. Iterators +==================== ## History of iterators @@ -9,7 +10,7 @@ other algorithms and learn to write iterators right once and for all. Let me tell you a little about them. -Anybody who programs in C++ is forced to use vector +Anybody who programs in C++ is forced to use `std::vector` and those have iterators. But, there are lots of people who do not quite understand what they are, partially because iterators are called iterators. @@ -25,12 +26,9 @@ It never existed. It's a mythical language. It was never implemented. But, many people at [CMU][cmu] got tenure because of it. It has some interesting ideas, including the idea of a generator. -For those of you who know [Python][python], it is like an [iterator in Python][py-iterator]. -Barabra Liskov said, wouldn't it be nice to write: - - for x in thing - -and iterators allow you to do that. +For those of you who know [Python][python], it is like an iterator in Python[^python-iterator]. +Barbara Liskov said, "wouldn't it be nice to write something like: `for x in thing`". +Iterators allow you to do that[^clu-iterators]. It is like a generator in Alphard. It is a procedure which returns multiple values, one by one. It's a procedural abstraction. @@ -40,48 +38,83 @@ It was a generalization of a control structure. At the same time, I was working on how to do algorithms and I introduced the notion of position. -A better name is coordinate, the name which Paul and I are use -in our book "Elements of Programming"[^eop]. +A better name is coordinate, the name which Paul and I use +in our book "Elements of Programming". A coordinate is some way of indicating where in the data structure you are. -It is not a control structure it's just the pointer into a data structure, +It is not a control structure, it's just the pointer into a data structure, or generalized notion of a coordinate. -It is something which allows me to navigate through to the data structure in a natural way the -data structure navigates. +It is something which allows me to navigate through the data structure in a +natural way[^coordinate-references]. Eventually I started talking to C++ guys and showed them coordinates and they said, "we call it iterator". Of course they didn't call what I had an iterator. They had some C++ code where they were trying to do CLU like iterators, heavy state procedural -things instead of lightweight pointer. +things instead of lightweight pointers. So, I chickened out. I said, "Yes, that's exactly what you guys have". I thought it was much better to win with the wrong name than lose with the right name, so the name is stuck. -It's in the standard, but again the concept which designates is not a -concept of iterator in CLU or iterator in Python which is not iterator in Java. -An iterator is a generalization of a coordinate in a data structure. -It's a lightweight thing, it doesn't do anything. -It just points to things. - -This argument is active again. -There are [boost][boost] guys -who say, "Iterator is all wrong. Let us, go back +It's in the standard. But again, the concept which it designates is not a +concept of iterator in CLU or iterator in Python or iterator in Java. +Our iterator is a generalization of a coordinate in a data structure. +It's a lightweight thing. It doesn't *do* anything, +it just *points* to something. 
+ +There are these arguments which I hear from people like the +Boost[^boost] guys, who say "Iterators are all wrong. Let us go back and do ranges." -They're reinventing Barbara Liskov iterators. -Just for the record, when I introduced my iterators, -I was very familiar -with the iterators in CLU. - Moreover I was even very familiar with Barbara herself -and with [Alan Snyder][snyder] who designed iterators. +Guess what? They're reinventing Barbara Liskov's iterators. +Just for the record, when I introduced my iterators +I was very familiar with the iterators in CLU. +Moreover I was even very familiar with Barbara herself +and with [Alan Snyder][snyder] who co-designed the iterators in CLU. I didn't do what they did because I wanted to do something else. It wasn't out of ignorance. Maybe I was stupid, but I wasn't ignorant. -[^eop]: Alex recommends chapter 7 of "Elements of Programming" - on coordinate structures. - +[^coordinate-references]: See chapter 7 of "Elements of Programming" on coordinate structures. + An interesting discussion on the general idea of "coordinatisation" + is found in chapter 1 of "Basic Notions of Algebra" by Shafarevich. + +[^boost]: [Boost][boost] is a popular collection of C++ libraries, covering a wide range of uses, + generally accepted as the next tool to reach for beyond the standard library. + Many standard library features, such as smart pointers, were initially developed in Boost. + Alex speaks positively of some parts (see [his foreword][alex-graph-foreword] for "The Boost Graph Library"), but others he is more critical of. + +[^python-iterator]: + Alex almost certainly means Python generators, not iterators, but I will describe both. + + An [iterator](https://wiki.python.org/moin/Iterator) in Python is any object which implements a method called `__next__()`. + Unlike C++ iterators, the `__next__()` always returns another element of the sequence, not an iterator, + so they do not resemble pointers or coordinates, neither are they comparable. + They are most similar to `InputIterator` in that previous values in the sequence become inaccessible after advancing. + The only thing special about the `__next__()` method is its compatibility with language constructs like `for` loops. + + A [generator](https://wiki.python.org/moin/Generators) in Python is a kind of `iterator` that is typically implemented + as a function with some helpful syntax additions. + In a generator function, the `yield` keyword is used to return the next value in the sequence. + If an additional value is requested after yielding, + the function will resume at the point of the previous call to `yield`. + This makes writing complex sequences more natural, as the control flow operates like other code. + For example, the following returns square numbers: + + def square_nums(count): + k = 0 + while k < count: + yield k * k + k += 1 + + It can be used in a `for` loop: + + for x in square_nums(10): + print(x) + # 0 1 4 ... + +[^clu-iterators]: See this [brief description of CLU iterators](http://web.mit.edu/ghudson/info/iterators). + [liskov]: https://en.wikipedia.org/wiki/Barbara_Liskov [clu]: https://en.wikipedia.org/wiki/CLU_(programming_language) [alphard]: https://en.wikipedia.org/wiki/Alphard_(programming_language) @@ -90,76 +123,66 @@ but I wasn't ignorant. 
[py-iterator]: https://wiki.python.org/moin/Iterator [boost]: https://www.boost.org/ [snyder]: https://dblp.org/pid/04/4444.html +[alex-graph-foreword]: http://stepanovpapers.com/siekforeword.pdf -## List pool iterators + +## Affiliated types for iterators We always need to distinguish between how we type code in C++ and what the notion is behind it. A key part of iterators is **affiliated types**. Iterators point to values and you want to know what those values are. Types don't work on their own. -They come in clusters, or connected families. -If you have `int*` there is an affiliated type `int` -which are related. -We want to be able to obtain `int` from `int*`. -Even in Python which has [duck typing][duck]. +They come in clusters, or connected families of types. +If you have a type `int*`, there is an affiliated type `int`. +The two types are related. It would be terribly nice if we had a +way to obtain `int` from `int*`. That is, if somebody +gives me a pointer type, I want a way to find out what type +it points to. + +To do this we need this notion of **type functions** which accept one type +and return a different type. +This problem of needing to obtain affiliated types is not specific to C and C++. +It appears in Java and Python as well. +In spite of Python's [duck typing][duck] there's *still* a connection +between types, even if they are duck types. + +So we need this notion of type functions, +but C doesn't let us do this, +and neither does C++. +Instead of type functions we are going to solve +this problem for iterators by using `typedef`s. For an iterator, there are [5 types][cpp-iterator-traits] that we always need to define. -3 are primary, 2 are secondary. +Three of them are primary, two are secondary. Let's start with the primary: 1. `value_type`: the type of the value it points to. 2. `difference_type`: Difference between pointers ([`ptrdiff_t`][ptrdiff] in C). Unlike C we might need a different type. - The length between elements in a range, depends on the range type. + The length between elements in a range depends on the range type. It is an integral type large enough to encode any valid range of a given iterator. 3. `iterator_category`: - Once again this is a general notion, not just C++. - There are `ForwardIterators`, `RandomAccessIterators`, - `InputIterators`, etc. - They are different theory of iterators. - - In C++ (without concepts) we use tag types. - Every iterator uses a tag type to signify what theory - it supports. - The tag lets you do compile type dispatch. - - What category is this iterator? - In the list pool, our iterator is a `ForwardIterator`. - Because, there is no way in a singly linked list - to go backward. - -Let's define these types in the `list_pool` -from last time. - - #include + Once again we need to distinguish between how we type this in C++ and what + the notion behind it is. + The notion here is that there are different categories, or theories, + of iterators: `ForwardIterators`, `RandomAccessIterators`, + `InputIterators`, etc... - // class list_pool { - // ... + In C++ (without concepts) we use tag types to designate the iterator categories. + Every iterator uses a tag type to denote which theory it supports. + The tag lets you do compile time dispatch[^compile-time-dispatch]. 
- struct iterator { - typedef list_pool::value_type value_type; - typedef list_pool::list_type difference_type; - typedef std::forward_iterator_tag iterator_category; - }; +### Historical artifacts - // }; - - -[duck]: https://en.wikipedia.org/wiki/Duck_typing -[ptrdiff]: https://en.cppreference.com/w/c/types/ptrdiff_t -[cpp-iterator-traits]: https://en.cppreference.com/w/cpp/iterator/iterator_traits - -### Iterator reference types - -The 4th and 5th types to be defined are required only for historical reasons. +The fourth and fifth types to be defined are required only for historical reasons. Microsoft said they were going to vote against STL unless it accommodated multiple memory models. At the time, they had tiny pointers, huge pointers and -[far pointers][far-ptr-article]. +far pointers. They wanted STL to somehow work with all of them. I had to figure out how they work. @@ -167,16 +190,16 @@ The difference between far pointer and huge pointer is really weird. They are both 32 bits. But, with far pointer if you add -one to it, and the first low two bytes overflow, it rotates without propagation. -With a huge, it does propagated to the 16th bit. +one to it, and the two lowest bytes overflow, +they wrap without propagation to the upper bytes. +With a huge pointer, the carry *is* propagated to the upper bytes, but by adding +8 to them[^ms-dos-pointers]. -So, they -they demanded that I change the whole architecture to accommodate them. +So they demanded that I change the whole architecture to accommodate them. Guess how they voted? No. Now we're stuck for the next hundred years with stuff which was included to placate people that - couldn't have been placated. - +couldn't have been placated. So what does an iterator return when you dereference it? Normally a reference. It's an [`lvalue`][lvalue] @@ -185,30 +208,62 @@ But, with far and tiny pointers, you don't know what the reference type should be. So now we need to provide it. -4. `reference`: the type of a reference to the value. -5. `pointer`: the type of a pointer to the value. - + 4. `reference`: the type of a reference to the value.
    + 5. `pointer`: the type of a pointer to the value. -Here is our implementation: +It's not particularly harmful, but it obfuscates things +and it provides "language experts" with steady employment. - typedef value_type& reference; - typedef value_type* pointer; +[^ms-dos-pointers]: See ["A look back at memory models in 16-bit MS-DOS"][far-ptr-article] + for a brief overview of these various pointer types. + The more general concept behind them is [memory segmentation][memory-segmentation]. -It's not particularly harmful, bu it obfuscates things -and it provides "language experts" with steady employment. +[^compile-time-dispatch]: Some algorithms can be implemented more efficiently for certain + iterator categories. For example [`std::distance`][std-distance] can be + implemented as a constant time algorithm for `RandomAccessIterators` but + only a linear time algorithm for other iterator categories. The + `iterator_category` tag allows the appropriate algorithm to be selected at + compile time. This technique is known as [tag dispatch][tag-dispatch]. +[duck]: https://en.wikipedia.org/wiki/Duck_typing +[ptrdiff]: https://en.cppreference.com/w/c/types/ptrdiff_t +[cpp-iterator-traits]: https://en.cppreference.com/w/cpp/iterator/iterator_traits +[std-distance]: https://en.cppreference.com/w/cpp/iterator/distance +[tag-dispatch]: https://quuxplusone.github.io/blog/2021/06/07/tag-dispatch-and-concept-overloading [lvalue]: https://en.wikipedia.org/wiki/Value_(computer_science)#lrvalue [far-ptr-article]: https://devblogs.microsoft.com/oldnewthing/20200728-00/?p=104012 +[memory-segmentation]: https://en.wikipedia.org/wiki/Memory_segmentation + +## List pool iterators + +Let's define these types in the `list_pool` iterator. +What category is the iterator for `list_pool` from last chapter? +We need `ForwardIterator`, because there is no way in a singly linked list +to go backwards. + + #include + + // class list_pool { + // ... + + struct iterator { + typedef list_pool::value_type value_type; + typedef list_pool::list_type difference_type; + typedef std::forward_iterator_tag iterator_category; + typedef value_type& reference; + typedef value_type* pointer; + }; + + // }; ### Constructors Let's write constructors for our iterator: iterator() {} - iterator(list_pool& p, list_pool.::list_type node) : + iterator(list_pool& p, list_pool::list_type node) : pool(&p), node(node) {} - We should explicitly call a constructor when we can. Default constructor shouldn't really be used because it guarantees only a partially formed value. @@ -244,9 +299,9 @@ you have to put aside thoughts that things could be done differently. If you work in C, you write C. -### Preincrement, postincrement +### Pre-increment, post-increment -When you increment the iterator should +When you increment, the iterator should move to the next node. iterator& operator++() { @@ -263,10 +318,9 @@ move to the next node. `int` here doesn't do anything, it's just to distinguish pre and post. -Preincrement could be automatically generated +Post-increment could be automatically generated and it would be criminal to do anything else. - ### Equality It's customary here to @@ -280,7 +334,7 @@ So let's try to define it without referring to copy. friend bool operator==(const iterator& x, const iterator& y) { - // assert(x.pool === y.pool); + // assert(x.pool == y.pool); return x.node == y.node; } @@ -293,7 +347,11 @@ We could also complain about using `==` instead of `=` for equality, as it violates mathematical tradition. But, oh well. 
When you took Algebra in grade school, -they use `=` and they use `x` and `y` and I think it's good. +they used `=` and they used `x` and `y` and I think it's good. + +**Exercise:** Experiment with list pool iterators by using + a standard library algorithm on them, such as `find` or `copy` + (see `test_list_pool_iterator.cpp` at the end of the chapter). ## Thoughts about iterator design @@ -301,9 +359,9 @@ they use `=` and they use `x` and `y` and I think it's good. Notice I sometimes write assertions in comments: - // assert(x.pool === y.pool); + // assert(x.pool == y.pool); -I don't write it because it takes too long to check[^assert]. +I don't use a real assert because it takes too long to check[^assert-modern-compilers]. There is nobody who should be comparing iterators from separate pools. If he does, he deserves what he gets. But, wouldn't it be good to guarantee safety? @@ -318,25 +376,25 @@ Turing machines are fundamentally unsafe. What about compiling debug mode? It's mildly useful. The only truly useful thing is to decompose -your program into clear subroutines and unit +your program into clear subroutines and clear units which you understand. That's the only thing I know that works. The reason to program in C++ is to have access to the machine. -We want pointers. +You want to have these unsafe things called pointers. If you want a language which hides the machine, -then use it. +then use it. It has its advantages. Python is good for writing scripts. But don't write your operating system in Python. Don't write your search engine. -[BASIC][basic] and [Cobol][cobol] were wonderful for what they are. -I wouldn't use Cobol to write an operating system. +[BASIC][basic] and [COBOL][cobol] were wonderful for what they are. +I wouldn't use COBOL to write an operating system. [basic]: https://en.wikipedia.org/wiki/BASIC [cobol]: https://en.wikipedia.org/wiki/COBOL [cpp-assert]: https://en.cppreference.com/w/cpp/error/assert -[^assert]: I think modern compilers have fixed this so you can +[^assert-modern-compilers]: I think modern compilers have fixed this so you can write checks with more confidence that they won't affect release builds. Specifically the standard has mandated some rules about when [assert][cpp-assert] is enabled. @@ -345,11 +403,11 @@ I wouldn't use Cobol to write an operating system. What is the basic example of a `ForwardIterator`? A singly linked list. -Forward iterators do not have `<`. +Forward iterators do not have 'less-than' defined on them. Is that a good idea? You could do it for linked lists, but it's very expensive and not even guaranteed to terminate[^swift-comparable]. -But you're assuming "If I am `<`, then I am before". +But you're assuming "If I am less-than, then I am before". This is only one interpretation. Let me explain. @@ -364,7 +422,7 @@ search, whether in a map, or whether in a sorted array. For example, suppose I want to see if an iterator is in the structure, such as inclusion in a set. The only effective way is comparison. -I was torn because I knew I could not include it, +I was torn because I knew I could not include it, because most people would attempt to write code in STL the old way they learned which is: @@ -378,13 +436,16 @@ write it, and it will compile and it will work. So, that was not an option. -[^swift-comparable]: The Swift language and standard library - actually implement very similar concepts and containers - as C++. 
In their equivalent to `ForwardIterator` - they actually [did require comparable](https://forums.swift.org/t/dropping-comparable-requirement-for-indices/3290), - and bounds check with it, - making it very difficult to write data structures like linked lists +[^swift-comparable]: The [Swift][swift] standard library actually takes a lot of inspiration from C++ and Alex's work. + For example, the protocol [`Collection`][apple-swift-collection] has an `Index` type which is equivalent to `ForwardIterator`, + with some differences. + One of these is the `Index` type [must be comparable](https://forums.swift.org/t/dropping-comparable-requirement-for-indices/3290), + in order to support safety features like bounds checking. + This restriction makes it difficult to write data structures like linked lists and probably makes Alex's pointer comparison trick impossible. + +[swift]: https://en.wikipedia.org/wiki/Swift_(programming_language) +[apple-swift-collection]: https://developer.apple.com/documentation/swift/collection ### Everything on a computer is totally ordered @@ -399,25 +460,22 @@ Equality is naturally extendable to struct. It was an oversight on the part of C and still C++ again. The compiler should generate equality, and it could generate inequality using -[lexographical][lex] ordering. +[lexicographical][lex] ordering. So if you want to use iterators -on set you can define a custom comparator. -One, which compares the addresses of the elements to which iterator points. +on set, you can define a custom comparator. +One which compares the addresses of the elements to which the iterator points. It has nothing to do with before and after, but it establishes a total ordering. A total ordering does not have to be topologically induced by the traversal. [lex]: https://en.wikipedia.org/wiki/Lexicographic_order -**Exercise:** extend the iterator to linked iterator - so we can assign the next on a node. - Specifically we want to be able to modify the successor of - an iterator. - +**Exercise:** Extend the list pool iterator + with the ability to modify the `next` of the node it points to + (this is discussed and solved in chapter 12). + ## Code - [list_pool_iterator.h](code/list_pool_iterator.h) - - - +- [test_list_pool_iterator.cpp](code/test_list_pool_iterator.cpp) diff --git a/10_binary_counter.html b/10_binary_counter.html index d0cd6d6..025e232 100644 --- a/10_binary_counter.html +++ b/10_binary_counter.html @@ -2,116 +2,9 @@ + 10. Balanced binary reduction - + @@ -133,20 +26,20 @@

    10. Balanced binary reduction

    Alice in wonderland

    -

    Let us attack the problem of finding not just the smallest +

    Let us introduce the problem of finding not just the smallest of n elements, but the smallest and second smallest. -The problem has a very distinguished pedigree it was first addressed by a well-known British mathematician +The problem has a very distinguished pedigree, it was first addressed by a well-known British mathematician Charles Dodgson (Lewis Carroll). If you haven’t heard of him, you should. -There is a very important book which he wrote, not the mathematical book -, but the book called “Alice’s Adventures in Wonderland”. -If you haven’t read it, do. -No person should be hired ever unless you read “Alice in Wonderland”. -in any case he was also a mathematician. -He also dabbled in all kind of games -apparently he invented Scrabble and bunch of other games.

    - -

    At some point he decided that there is a clear problem with tennis tournaments. +There is a very important book which he wrote, not the mathematical book, +but the book called “Alice’s Adventures in Wonderland”. +If you haven’t read it, do1. +No person should be hired ever unless they read “Alice in Wonderland”. +In any case, he was also a mathematician. +He also dabbled in all kind of games. +Apparently he invented Scrabble and a bunch of other games2.

    + +

    At some point he decided that there is a clear problem with lawn tennis tournaments. He observed that with a very high probability, if you have say a tournament with 64 players, the guy who gets the second prize is actually not the second strongest. @@ -154,110 +47,111 @@

    Alice in wonderland

    second strongest could be paired in the first round. Therefore the second strongest guy gets eliminated and doesn’t get the second prize, in spite of his prowess. -This is is why now they use a technique known is seeding to assure +This is why they now use a technique known as seeding to assure that people of similar ability are spread out to different parts of the tree. -But, he wanted to come up with an algorithm -which assures that the second guy is truly the second guy. -He published it in 1883. -The algorithm wasn’t quite an algorithm and it was clearly not optimal. -It took 15 more years before the problem was stated correctly. -People realized that you could talk about minimum +But he wanted to come up with an algorithm +which assures that the second-placed guy is truly the second best player. +He published it in 18833. +The algorithm wasn’t quite an algorithm4 and it was clearly not optimal. +It took 50 more years before the problem was stated correctly. +People realized that you could talk about the minimum number of comparisons but it took another thirty years, until 1964 when a Russian mathematician Sergei S. Kislitsyn published a paper which proved there is an optimal algorithm and described it.

    -

    By the way, all of this information is available1 in -a -book called “The Art of Computer Programming” by Donald Knuth . +

    By the way, all of this information is available in a +book called “The Art of Computer Programming” by Donald Knuth5. You really should buy it. You make a certain commitment. You spend $150 saying that you really care about programming. -This is not a book which is “useful” meaning it’s not “programming in Python for idiots”, +This is not a book which is “useful”, meaning it’s not “programming in Python for idiots”, or “information retrieval for 21st century” or something like that. This is one of the fundamental books which you buy and then spend your lifetime getting the information out of it. -You get beautiful things which you then could use for programming in Python or information retrieval or -other things. +You get beautiful things which you then could use for programming in Python or information retrieval or other things. I’m going to be mentioning Knuth throughout the course. It’s not a perfect book, it is just the greatest book we’ve got. Some people think it’s a good reference book. No, it’s not, because you have to basically do linear search to find what you’re interested in. -Another important thing, do not spend too much time solving problems. +Another important thing; do not spend too much time solving problems. Read the solutions. -They are right at the end. +They are right at the end. Lots of very important algorithms are described in the solutions to his problems. Reading Knuth has to become a lifelong activity.

    Smallest and second smallest element

    -

    The problem of finding the smallest and second smallest element, -is in “The Art of Computer Programming”, but somehow Knuth does not implement it. -He describes it fully but doesn’t implement it as an algorithm. -As we shall see its actually a little tricky. -I pick this algorithm not because it is of paramount importance for your future work. -I pick this algorithm because it allows us to learn how to do decomposition -and learn components along the way (like lists).

    +

    The problem of finding the smallest and second smallest element +is described fully in “The Art of Computer Programming”, but somehow Knuth does not implement it. +As we shall see it’s actually a little tricky.

How many comparisons do you need to solve this problem?
Same as minmax_element from last time?
No. You can do it in fewer.
Let us try to use some logic.
How many comparisons do we need to find the winner of the tournament? n - 1,
and it is necessary to find the winner in order to find the second place guy.
We could sketch a proof of this.
Consider the guys who could be greater than our candidate for second place.
If none of them is greater than him, he is first place, not second place.
If two of them are, he isn’t second place either.

    - -

    What do we know about second place, -specifically the games he lost. +Let us try to use some logic. +How many comparisons do we need to find the winner of the tournament? n - 1 +because it is necessary to find the winner in order to find the second place guy. +We could sketch a proof of this. +Let us assume there are two potential guys greater than the second place guy. +If neither are greater than him, he is first place, not second place. +If both of them are, he isn’t second place.

    + +

What do we know about the second place guy, specifically about the games he lost?
He only lost one game, and it was to the winner.
This is a very important property which tells us why we don’t need to do many comparisons.
If the winner remembers all the games he won, and who he played, how do we find second place?
We determine the best from the subset of players he beat.

    +This is a very important property which tells us why we don’t need to do many comparisons. +If the winner remembers all the games he won, and who he played how do we find second place? +We determine the best from the subset of players he beat.

    -

    For example, Wimbledon has 64 players who are admitted. +

How many people does the winner beat to win?
For example, Wimbledon has 64 players who are admitted.
The winner doesn’t play 63 games.
The tournament is structured as a binary tree.
How deep is this tree if you have n elements?
It’s ceil(log_2(n)).
We could somehow arrange our tournament so that the
winner will defeat this many people.

    +winner will defeat this many people.

    + +
     winner: d
    + second place: a or c
     
          d
         / \
        a   d
       /\   /\
      /  \ /  \
     a   b c   d      ceil(log_2(4)) = 2

    To review, we -have n - 1 comparisons to get the winner. -ceil(log_2(n)) - 1 to get the second best, -from the list he played. -So the upper bound for the algorithm is:

    + d + / \ + a d + /\ /\ + / \ / \ +a b c d ceil(log_2(4)) = 2 +
    + +

    Now that we have a list of the ceil(log_2(n)) people who played the winner, +we just find the best out of them which is ceil(log_2(n)) - 1 comparisons.

    + +

    To review, we have n - 1 comparisons to get the winner. +ceil(log_2(n)) - 1 to get the second best, from the list he played. +So an upper bound on the comparisons for the algorithm is:

    n + ceil(log_2(n)) - 2
     

    Actually implementing this algorithm effectively -will require us to create several components. -We will build these up over the next few lessons.

    +will require us to create several components.

    - -

    Unoptimal divider and conquer approach

    + +

    What about divide and conquer?

    It might appear you could use divide and conquer. -First split it in two, -find min and second min of the first half, +First split the list of elements in two, +find the min and second min of the first half, and the second half, and then merge them together doing two comparisons. It sounds very elegant because it’s all recursive. But, let us think about how many comparisons it’s going to do -with simple mathematics.

    +using simple mathematics.

      -
    1. We start with n we need to pair and compare them. -So the first round is n/2 comparisons.

    2. +
1. We start with n. We need to pair and compare them,
so the first round is n/2 comparisons.

2. In the second round we pair up the results,
so we need n/4 “games”.
But, each game requires two comparisons
(to find min and max).
@@ -266,24 +160,23 @@

      Unoptimal divider and conquer approach

    -

    So that would be:

    - -
    n/2 + (n/2 + n/4 + n/8 + ...)
    +

    So the total number of comparisons would be:

    +
      n/2 + (n/2 + n/4 + n/8 + ...)
     = n/2 + n-1
    += 3n/2 - 1
     
    -

    total comparisons. -That’s not what we’re trying to accomplish, -so divide and conquer doesn’t always do what we think.

    +

    That’s not what we’re trying to accomplish, so divide and conquer doesn’t always do what we think.

    Tournament tree shapes

    -

    First, we need rearrange the tournament we play. +

    To get the number of comparisons that we want, we need to rearrange the tournament we play. Right now min_element plays a tree structure that looks like this:

    -
    Unbalanced Tree
    +
    unbalanced tree
    +
     /\
       /\
         /\
    @@ -292,11 +185,12 @@ 

    Tournament tree shapes

    It has n - 1 internal nodes. -But, we don’t want the winner to play n - 1 matches. -We need to transform that into the way they play Tennis. +But we don’t want the winning element to be compared n - 1 times. +We need to transform that into the way they play tennis tournaments. We need to balance the tree.

    -
    Balanced Tree
    +
    balanced tree
    +
         / \
        /\ /\
         ...
    @@ -305,119 +199,128 @@ 

    Tournament tree shapes

    How do we do it? One way is to just pair up elements and build up. -But then we need lots of memory to save the intermediate results. -What do we mean when we say lots of memory? -O(n) is bad, O(sqrt(n)) is pretty bad.

    - -

    Note that once a bottom-level round has been played, -they are ready to move up. -Our goal is to basically to become eager. -Whenever guys are ready to be paired we want to pair them. -So if we only store only the winner at each level, -we never need to store log(n) things. +But then we need lots of memory to save the intermediate results6. +Note that once a bottom-level round has been played, they are ready to move up. +Our goal is basically to become eager. +Whenever elements are ready to be paired together, we want to pair and compare them.

    + +

    So if we store only the winner at each level, we only need to store log(n) things. We can define the power of each element to be the number of games they have played.

    Realize that suddenly we see something which has nothing to do with our problem. -The foundation of our algorithm is the ability to take a tree like -the linear (unbalanced) tree and transform it into a balanced tree. +The foundation of our algorithm is the ability to take a tree, like +the linear (unbalanced) tree, and transform it into a balanced tree. What mathematical property allows us to do such a transformation? -Specifically why can we convert one kind of computation to the other. -Associativity2. +Why can we convert one kind of computation to the other? +Associativity7. As long as our operation is associative, -What property don’t we need? Commutativity. -We keep them in the same order, -we just rebalanced parenthesis.

    - -

    If you think about it, -our min is not quite commutative. -In mathematics min is commutative. -But, because we want to preserve stability it is not. -We distinguish between the left and right argument.

    +what property don’t we need? Commutativity8. +We keep the elements in the same order, +we’re just rebalancing parentheses9.

    Binary counting and reduction

    Here we come to the amazing idea of how to do this transformation. -This is one of the most beautiful ideas which they -kept secret from you. +This is one of the most beautiful ideas which they kept secret from you. They should have taught it in high school. -But, they want to publish papers -themselves and not tell you the general mechanism.

    +But, they want to publish papers themselves and not tell you the general mechanism.

    + +

    Let us assume we have elements of type T that need to be paired or combined in some way, +whether with min, +, merge, or any other associative operation on T. +What we can do is create an array called a “counter”.

    -

    We can create a counter, and in every bit of this counter we’re -going to keep a singleton. -In each bit we keep the person who had n victories. -We will never combine things unless they have the same weight/parity.

    +
       index: 0  1  ...   31
    +contents: x1 x2  ...  x32
    +
    + +

    The nth slot of the counter will store the element that has had n “victories” so far. +So if there is a guy in slot 0 he hasn’t played any games yet. +If there is a guy in slot 2 he has won 2 games, and so on. +This structure will help us to only pair up elements that have the same power.

    + +

    The following example using min as the operation should make this clear. +Initially the counter has zero in every entry:

    -

    Initially the counter has zero in every entry:

    +
    initial counter
     
    -
    1 2  ...   32 (bits/singletons)
    -0 0  ...   0
    +   index: 0 1  ...   31
    +contents: 0 0  ...   0
     

    Take a new guy x who has never played any games, -and you look at the guy in the first slot of the counter. +and look at the guy in the first slot of the counter. The existing guy is either zero or not. -If it’s zero (_), put him in the counter.

    +If it’s zero, put the new guy x in the counter at index 0 (he has not played any games).

    -
    1 2  ...   32         1 2  ...    32
    -0 0  ...   0    >>>   x 0  ...    0
    +
    0 1  ...   31           0 1  ...   31
    +0 0  ...   0     -->    x 0  ...   0
     
    -

    If it’s not zero, he plays a game with the existing guy. -If he wins, he replaces the loser in the counter.

    +

    Now take another guy y. Since x is in the first slot of the counter, we must pair them up. +The winner moves on up to the next slot in the counter, +as they have now won a game. +So if y wins:

    -
    1 2  ...   32           1 2  ...   32
    -y 0  ...   0     >>>    x 0  ...   0
    +
    0 1  ...   31           0 1  ...   31
    +x 0  ...   0     -->    0 y  ...   0
     
    -

    Otherwise, the existing guy has now won a game. -So he needs to be promoted to the next level, -he follows the same rules with the guy in that slot. -It’s a carry propagation.

    +

    Otherwise x wins:

    + +
    0 1  ...   31           0 1  ...   31
    +x 0  ...   0     -->    0 x  ...   0
    +
    -
    1 2  ...   32           1 2  ...   32
    -y 0  ...   0    >>>     0 y  ...   0
    +

What if the index 1 slot was non-zero, after comparing x and y?
Then the guy there already won one game.
So, we must carry propagate [10].
Repeat the same process all the way up the counter, until we find a slot which is zero.
What if the counter is full, and has no zero slots?
That’s called an overflow.

    + +

    We borrow terminology from binary integer arithmetic because +our counter works just like a binary integer counting up:

    + +
    0 0 0
    +1 0 0
    +0 1 0
    +1 1 0
    +0 0 1
    +1 0 1
    +0 1 1
    +1 1 1
     
    -

    If we end up with a guy in slot 32, it’s an overflow, -exactly like integer arithmetic. -What do we do? -Whenever we don’t know to proceed, -do something sensible and let whomever uses it figure out what is a sensible thing -to do. -Return the carry3.

    +

    But instead of 0 and 1 in each slot or “bit” we have arbitrary elements that are combined with an associative operation.
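To see how literal the analogy is, here is a sketch of my own (not from the lecture) of plain binary increment written in exactly the shape our counter will use: march up the slots, clearing the full ones, until an empty slot absorbs the carry.

    #include <cstddef>
    #include <vector>

    // returns true when the carry falls off the end: an overflow
    bool increment(std::vector<int>& bits) { // bits[0] is slot 0
      for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == 0) { bits[i] = 1; return false; } // carry absorbed
        bits[i] = 0; // 1 + 1 = 10: clear this slot and keep carrying
      }
      return true;
    }

Replace the 1 with an arbitrary element, and the clear-and-carry step with an application of the associative operation, and you have the counter we are about to write.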

    + + +

    Handling overflow

    + +

What do we do if the counter overflows?
Whenever we don’t know how to proceed,
do something sensible and let whomever uses it figure out what is a sensible thing to do.
Return the carry.
If the return is non-zero, the programmer who called the counter will know
it overflowed and can decide what to do.
Maybe they will extend the counter or throw an error.
It’s his business, not ours.

    Let us be lazy. -The great success in -programming comes because there are lazy people who say, “I don’t want to know now, +The great success in programming comes because there are lazy people who say, +“I don’t want to know now, I’ll find out later.” Right now we are solving this problem. We have an associative binary operation of some kind - and what we discovered that if we have associativity, +on type T and what we discovered is if we have associativity, we can make this counter and it will work for us.

    -

    If you are familiar with numerical analysis, -whenever you sum up large number you don’t really want to -add small quantities to big quantitative. -Bad things happen to the errors4. -So, you could use the same device for balancing your addition. -If you want to implement merge sort you can use exactly the same device, since -merge is associative5. -The idea with merge sort, is only want to merge lists if they are roughly the same length -and this helps you do it.

    - -

    When we become grownups we learn about advanced data structures, -such as binomial forest. -They use the same idea. -The counter helps us combine things only when they have the same weight. -nless they are of the same weight it’s a general algorithmic technique

    -

    Implementation

    -

    The first function will add an element to the counter +

    Now we have to write the code. +The first function will add an element to the counter using the process we just described.

    template <typename T, typename I, typename Op>
    @@ -444,12 +347,19 @@ 

    Implementation

    Elements in the counter got there before, so they were to the left of the element we are inserting.

    -

    Notice that zero is const reference because we don’t plan to modify it, +

    Notice that zero is const T& reference because we don’t plan to modify it, but we do modify carry, so it should be passed by value.
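Since the listing is abbreviated here, the following is my reconstruction of the process exactly as described (the chapter's actual code is in the file linked at the end). The element already sitting in the counter arrived earlier, so it is passed as the left argument of `op`:

    template <typename T, typename I, typename Op>
    // requires Op is BinaryOperation(T)
    // requires I is a ForwardIterator with value_type T
    T add_to_counter(I first, I last, Op op, T carry, const T& zero) {
      // precondition: carry != zero
      while (first != last) {
        if (*first == zero) {
          *first = carry; // an empty slot absorbs the carry
          return zero;    // returning zero means: no overflow
        }
        carry = op(*first, carry); // the older element goes on the left
        *first = zero;             // the slot is emptied; keep propagating
        ++first;
      }
      return carry; // a non-zero return is the overflow
    }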

    -

    The second function applies the operation -to all the elements left sitting in the counter.

    + +

    Reduction

    + +

    After we finish adding all our elements to the counter, they might not all be reduced to one element. +There may be several elements left sitting at various levels of the counter. +We need to do one more pass of the operation to combine them into the final result.

    + +

This second function does that. It applies the operation,
in the same manner, to the elements left sitting in the counter.

    template <typename T, typename I, typename Op>
     // requires Op is BinaryOperation(T)
    @@ -476,12 +386,8 @@ 

    Implementation

    gives you x, but sometimes that won’t happen. So we can’t really initialize to zero.
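As a sketch consistent with the remark above (again my reconstruction; see the chapter's code file for the real version), we seed the result from the first non-empty slot, precisely because we cannot start from zero:

    template <typename T, typename I, typename Op>
    // requires Op is BinaryOperation(T)
    // requires I is a ForwardIterator with value_type T
    T reduce_counter(I first, I last, Op op, const T& zero) {
      // skip empty slots; op(x, zero) is not guaranteed to give x
      while (first != last && *first == zero) ++first;
      if (first == last) return zero; // nothing was ever added
      T result = *first;
      while (++first != last) {
        if (*first != zero)
          result = op(*first, result); // higher slots go on the left
      }
      return result;
    }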

    -

    Exercise: Use these functions to sum up an array of doubles.

    - -

    Exercise: Rewrite min_element using these functions (just min_element, don’t worry about second best).

    - - -

    Binary counter object

    + +

    Binary counter class

    Start with algorithms

    @@ -490,7 +396,7 @@

    Start with algorithms

    these two algorithms and combine them into an object. We have two beautiful algorithms but they’re stateless. Everything is outside, and this is what we should always do, -no matter what we has been taught in software engineering classes by very wise +no matter what we have been taught in software engineering classes by very wise object-oriented professors. You start with algorithms. You don’t start with objects. @@ -498,12 +404,11 @@

    Start with algorithms

    But you don’t have to stop there. Because you can then put things together into an object.

    -

    It’s very easy when you write an algorithm to -have a minimal iterator interface. -They externalize the counter. +

    It’s very easy when you write an algorithm to have a minimal iterator interface. +In this case, the iterators externalize the counter. We say, “we don’t want to know about him, we’re just algorithms people”. -We assume the principle that we will have no state for about +We assume the principle that we will have no state for about five minutes, and stay very functional, but then turn around and deal with state. We look at the whole thing.

    @@ -515,7 +420,7 @@

    Counter storage

    How should we store the counter? Don’t we have millions of elements to reduce? The counter is size log(n) which will -never be greater than 64. +never be greater than 64. So, it’s actually a small fixed size. We will store it in a std::vector.

    @@ -545,26 +450,38 @@

    Counter storage

    };
    -

    Counter is private, we don’t +

    counter is private, we don’t want people to muck up our counter. -Same with our operation

    +Same with our operation.

    Using initializer lists instead of assignment in constructor is important. If you initialize in the body, it will first call default constructors for members, then you overwrite all the work with an assignment.

    I think it is very beautiful. -We could compete with Steve Jobs for elegance of our design6.

    +We could compete with Steve Jobs for elegance of our design11.

    + +

Exercise: In numerical analysis, whenever you sum up large numbers you don’t really want to add small quantities to big quantities.
Bad things happen to the errors [12].
Use this code to write a function which sums arrays of double.

    + +

    Exercise: Rewrite min_element using this code (just min_element, don’t worry about second best).

    + +

Exercise: If you want to implement merge sort you can use exactly the same device, since merge is associative. The idea with merge sort is that you only want to merge lists if they are roughly the same length, and this helps you do it (see Chapter 12).
Write the associative binary operation merge which can combine two sorted arrays into a sorted array.

    + +

Exercise: When we become grownups we learn about advanced data structures, such as binomial forest. They use the same idea. Learn about this data structure and try to figure out where the counter could be used.

    What is in-place memory usage?

    How significant is the storage of our counter? -We use the term “in-place” to indicate the memory +We use the term in-place to indicate the memory usage of an algorithm is not significant. A long time ago people thought -algorithms were in-place if they didn’t require any extra memory. -Then they decided constant memory was enough. +algorithms were in-place if they didn’t require any extra memory. +Then they decided constant memory was enough. But, then they said it doesn’t quite work because our favorite algorithm doesn’t work. Of course the most important algorithm is quicksort. It’s not in-place because it’s recursive. @@ -572,11 +489,10 @@

    What is in-place memory usage?

    Quicksort splits roughly in the middle let’s assume on average.

    So people say log(n) is good so that will count as in-place. -Then they said , “what if we have nested things. -So is log(n)^2 ok”? -In our universe log(n) <= 64 so this is 4096.

    - -

    Then theoreticians said it’s really alright if we have “poly-logarithmic” storage +Then they said, “what if we have nested things?” +So is log(n)^2 ok? +In our universe log(n) <= 64 so this is 4096. +Then theoreticians said it’s really alright if we have “poly-logarithmic” storage Basically as long as the memory requirement is O(p(log(n))) where p is a polynomial, it’s alright. Polynomials get a lot bigger than square, @@ -587,41 +503,99 @@

    Code


    1. -See chapter 5.3.3 in Volume 3 of “The Art of Computer Programming”.
    2. +The book is freely available from Project Gutenberg.
    3. -

      Associativity is a fundamental property studied in abstract -algebra. -Informally, it is that you can apply an operation to arguments -in any order you want. -Formally:

      +

      The invention of Scrabble is attributed to Lewis Carroll’s brief journal entry: +“A game might be made of letters, to be moved about on a chess board till they form words” (Dec 19th, +“The Life and Letters of Lewis Carroll” ). +See also “History of Scrabble” +and “The games of Lewis Carroll”.

      + +

      His book “The Game of Logic” teaches formal logic +using a board game. It is also available from Project Gutenberg.

    4. +
    5. +“Lawn tennis tournaments; the true method of assigning prizes with a proof of +the fallacy of the present method”. London, Macmillan and co., 1883.
    6. +
    7. +Knuth: it is not formulated precisely enough to qualify as an algorithm.
    8. +
    9. +See chapter 5.3.3 in Volume 3 of “The Art of Computer Programming”.
    10. +
    11. +Alex: What do we mean when we say lots of memory? +O(n) is bad, O(sqrt(n)) is pretty bad. +See the definition of “in-place memory usage” at the end of the chapter.
    12. +
    13. +

      A binary function f is associative +if the following holds for all a, b, c in its domain:

      f(f(a, b), c)) = f(a, f(b, c))
       
      -

      For example, multiplication:

      +

      Informally, f can be applied in any order. +For example, addition of integers is associative:

      -
      (a * b) * c = a * (b * c)
      +
      (a + b) + c = a + (b + c)
      +
      + +

      Subtraction of integers is not associative:

      + +
      (3 - 2) - 1 != 3 - (2 - 1)
       

      You can see why associativity is often referred to as being -able to “re-paranthesize”.

    14. -
    15. -

      The terms carry and carry propagation -are usually associated with the algorithm of binary addition -and especially its implementation as -an electronic/logical circuit called an adder.

      +able to “re-parenthesize” expressions.

    16. +
    17. +

      A binary function f is commutative +if for all a, b in its domain:

      + +
      f(a, b) = f(b, a)
      +
      + +

      Informally, f gets the same result, regardless of the order of the inputs. +For example, multiplication of integers is commutative:

      + +
      a * b = b * a
      +
      + +

      In Chapter 9.1 of “From Mathematics to Generic Programming”, Alex gives +a neat visual proof of this fact for integer multiplication:

      -

      An adder has two inputs a and b for +

                    * * *
      +* * * * *     * * *
      +* * * * *  =  * * *
      +* * * * *     * * *
      +              * * *
      +
      + +

      Or as Dirichlet put it: “Whether you arrange soldiers in rows or columns, you still have the same number of soldiers”.

      + +

      An example of an operation which is not commutative is string concatenation.

      + +
      "Hello, " + "World!" != "World!" + "Hello, "
      +
    18. +
    19. +Alex: If you think about it, +our min is not quite commutative. +In mathematics min is commutative. +But because we want to preserve stability, it is not. +We distinguish between the left and right argument.
    20. +
    21. +

      The terms carry and overflow +are closely associated with the implementation of binary counting +or addition as an electrical circuit, called an adder.

      + +

      A single bit adder has two inputs a and b for the two digits to add together. It outputs a digit s which is the digit to display for this place value. -There is supplementary output called a carry +There is a supplementary output called the carry, which is the amount that needs to move to the higher place value.

      @@ -635,7 +609,7 @@

      Code

    This adds a single digit, but we want to add entire numbers. -To, do so we add an additional input l for carried digits +To do so we add an additional input l for carried digits from lower place values.

    a  b  l    c  s
    @@ -648,7 +622,7 @@ 

    Code

    1 1 1 | 1 1
    -

    Now we can chain n of them togeter to be able to add n digits

    +

    Now we can chain n of them together to be able to add n digits

        s_1 c_1-----|      s_2 c_2----  ...
         |   |       |      |   |
    @@ -657,31 +631,30 @@ 

    Code

    l_1 a_1 b_1 |_ l_2 a_2 b_2
    -

    An overflow is when the last adder has a non-zero carry.

    -
  • -

    Numerical floating point calculation -is a subtle subject, but the basic issue Alex is -referring to is straightforward. -Floating point stores large numbers as decimals to some power. -If you add a small decimal to that, it may not change it all. -To try for yourself, compare the output of:

    - -
    (152500.0 * 5000.0)
    +

    An overflow is when the last adder has a non-zero carry.

  • +
  • +Alex: Maybe we should make it in China. +That’s a necessary prerequisite for beautiful design. +Designed in Cupertino, assembled in China. +So let’s try to assemble our machine in Palo Alto.
  • +
  • +

    Alex is referring to the following issue: +When floating point numbers become very large they can no longer represent small nearby increments. +So if many small numbers are accumulated in order, the sum may grow large enough to ignore the contribution of any particular element. +To see for yourself, observe that:

    + +
    double x = 1.45 * pow(2, 60);
    +assert(x == x + 1.0);
     
    -

    and

    +

    This is because floats are represented as a base and exponent, and the magnitude of the exponent constrains the precision.

    -
    (152500.0 * 5000.0) + 0.01
    -
  • -
  • -When I first learned about this algorithm, I showed a friend -and he immediately had the idea to use it -for merge sort, without any hint this was possible!
  • -
  • -Alex: Maybe we should make it in China. -That’s a necessary prerequisite for beautiful design. -Designed in Cupertino assembled in China. -So let’s try to assemble our machine in Palo Alto.
  • +

If we forget about computers, and think about science or engineering, it’s natural to consider:

    + +
    1.45 * 2^60 + 1 ~= 1.45 * 2^60
    +
    + +

because the 1 is not very significant to the overall magnitude.

    diff --git a/10_binary_counter.md b/10_binary_counter.md index 83f55b9..6e71735 100644 --- a/10_binary_counter.md +++ b/10_binary_counter.md @@ -3,20 +3,20 @@ ## Alice in wonderland -Let us attack the problem of finding not just the smallest +Let us introduce the problem of finding not just the smallest of `n` elements, but the smallest and second smallest. -The problem has a very distinguished pedigree it was first addressed by a well-known British mathematician +The problem has a very distinguished pedigree, it was first addressed by a well-known British mathematician [Charles Dodgson][carroll] (Lewis Carroll). If you haven't heard of him, you should. -There is a very important book which he wrote, not the [mathematical book][carroll-logic] -, but the book called ["Alice's Adventures in Wonderland"][alice]. -If you haven't read it, do. -No person should be hired ever unless you read "Alice in Wonderland". -in any case he was also a mathematician. -He also dabbled in all kind of games -apparently he [invented Scrabble][scrabble] and bunch of other games. - -At some point he decided that there is a clear problem with tennis tournaments. +There is a very important book which he wrote, not the [mathematical book][carroll-logic], +but the book called ["Alice's Adventures in Wonderland"][alice-in-wonderland]. +If you haven't read it, do[^alice-free-ebook]. +No person should be hired ever unless they read "Alice in Wonderland". +In any case, he was also a mathematician. +He also dabbled in all kind of games. +Apparently he invented Scrabble and a bunch of other games[^inventor-of-scrabble]. + +At some point he decided that there is a clear problem with lawn tennis tournaments. He observed that with a very high probability, if you have say a tournament with 64 players, the guy who gets the second prize is actually not the second strongest. @@ -24,44 +24,58 @@ He observed that the strongest and second strongest could be paired in the first round. Therefore the second strongest guy gets eliminated and doesn't get the second prize, in spite of his prowess. -This is is why now they use a technique known is [seeding][seed] to assure +This is why they now use a technique known as [seeding][seed] to assure that people of similar ability are spread out to different parts of the tree. -But, he wanted to come up with an algorithm -which assures that the second guy is truly the second guy. -He published it in 1883. -The algorithm wasn't quite an algorithm and it was clearly not optimal. -It took 15 more years before the problem was stated correctly. -People realized that you could talk about minimum +But he wanted to come up with an algorithm +which assures that the second-placed guy is truly the second best player. +He published it in 1883[^lawn-tennis-article]. +The algorithm wasn't quite an algorithm[^not-an-algorithm] and it was clearly not optimal. +It took 50 more years before the problem was stated correctly. +People realized that you could talk about the minimum number of comparisons but it took another thirty years, until 1964 when a Russian mathematician [Sergei S. Kislitsyn][sergei] published a paper which proved there is an optimal algorithm and described it. -By the way, all of this information is available[^aoc-ref] in -a -book called ["The Art of Computer Programming"][aoc] by [Donald Knuth][knuth] . +By the way, all of this information is available in a +book called ["The Art of Computer Programming"][aoc] by [Donald Knuth][knuth][^aoc-second-smallest-ref]. You really should buy it. 
You make a certain commitment. You spend $150 saying that you really care about programming. -This is not a book which is "useful" meaning it's not "programming in Python for idiots", +This is not a book which is "useful", meaning it's not "programming in Python for idiots", or "information retrieval for 21st century" or something like that. This is one of the fundamental books which you buy and then spend your lifetime getting the information out of it. -You get beautiful things which you then could use for programming in Python or information retrieval or -other things. +You get beautiful things which you then could use for programming in Python or information retrieval or other things. I'm going to be mentioning Knuth throughout the course. It's not a perfect book, it is just the greatest book we've got. Some people think it's a good reference book. No, it's not, because you have to basically do linear search to find what you're interested in. -Another important thing, do not spend too much time solving problems. +Another important thing; do not spend too much time solving problems. Read the solutions. -They are right at the end. +They are right at the end. Lots of very important algorithms are described in the solutions to his problems. Reading Knuth has to become a lifelong activity. -[^aoc-ref]: See chapter 5.3.3 in Volume 3 of "The Art of Computer Programming". +[^aoc-second-smallest-ref]: See chapter 5.3.3 in Volume 3 of "The Art of Computer Programming". -[alice]: https://www.gutenberg.org/ebooks/11 +[^not-an-algorithm]: Knuth: it is not formulated precisely enough to qualify as an algorithm. + +[^lawn-tennis-article]: "Lawn tennis tournaments; the true method of assigning prizes with a proof of + the fallacy of the present method". London, Macmillan and co., 1883. + +[^inventor-of-scrabble]: The invention of Scrabble is attributed to Lewis Carroll's brief journal entry: + "A game might be made of letters, to be moved about on a chess board till they form words" (Dec 19th, + ["The Life and Letters of Lewis Carroll"](http://www.fullbooks.com/The-Life-and-Letters-of-Lewis-Carroll3.html) ). + See also ["History of Scrabble"](https://scrabbledaily.blogspot.com/2008/05/history-of-scrabble.html) + and ["The games of Lewis Carroll"](http://www.bananagrammer.com/2009/10/games-of-lewis-carroll.html). + + His book ["The Game of Logic"](https://en.wikipedia.org/wiki/The_Game_of_Logic) teaches formal logic + using a board game. It is also available from [Project Gutenberg](https://www.gutenberg.org/ebooks/4763). + +[^alice-free-ebook]: The book is freely available from [Project Gutenberg](https://www.gutenberg.org/ebooks/11). + +[alice-in-wonderland]: https://en.wikipedia.org/wiki/Alice%27s_Adventures_in_Wonderland [carroll]: https://en.wikipedia.org/wiki/Lewis_Carroll [scrabble]: http://www.bananagrammer.com/2009/10/games-of-lewis-carroll.html [seed]: https://en.wikipedia.org/wiki/Seed_(sports) @@ -69,76 +83,79 @@ Reading Knuth has to become a lifelong activity. [aoc]: https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming [sergei]: http://www.mathnet.ru/eng/person27000 [carroll-logic]: https://www.gutenberg.org/ebooks/28696 +[lawn-tennis]: https://en.wikipedia.org/wiki/History_of_tennis ## Smallest and second smallest element - -The problem of finding the smallest and second smallest element, -is in "The Art of Computer Programming", but somehow Knuth does not implement it. -He describes it fully but doesn't implement it as an algorithm. -As we shall see its actually a little tricky. 
-I pick this algorithm not because it is of paramount importance for your future work. -I pick this algorithm because it allows us to learn how to do decomposition -and learn components along the way (like lists). +The problem of finding the smallest and second smallest element +is described fully in "The Art of Computer Programming", but somehow Knuth does not implement it. +As we shall see it's actually a little tricky. How many comparisons do you need to solve this problem? -Same as min-max from last time? +Same as `minmax_element` from last time? No. You can do it in fewer. -Let us try to use some logic . -How many comparisons do we need to find the winner of the -tournament? `n - 1`. -It is necessary to find the winner, in order to find the second place guy. -We could sketch a proof. -Let us assume there are two potential guys greater than him. -If there is none, he isn't second place. -If there are two, he isn't second place either. - -What do we know about second place, -specifically the games he lost. +Let us try to use some logic. +How many comparisons do we need to find the winner of the tournament? `n - 1` +because it is necessary to find the winner in order to find the second place guy. +We could sketch a proof of this. +Let us assume there are two potential guys greater than the second place guy. +If neither are greater than him, he is first place, not second place. +If both of them are, he isn't second place. + +What do we know about the second place guy, specifically about the games he lost? *He only lost one game, and it was to the winner*. -If the winner remembers all the games he won, -how could we determine second place? -How many people did he beat? +This is a very important property which tells us why we don't need to do many comparisons. +If the winner remembers all the games he won, and who he played how do we find second place? +We determine the best from the subset of players he beat. +How many people does the winner beat to win? For example, [Wimbledon][wimbledon] has 64 players who are admitted. The winner doesn't play 63 games. -The tournament is structured as a binary tree. +The tournament is structured as a [binary tree][binary-tree]. This tree is how deep? If you have `n` elements? It's `ceil(log_2(n))`. We could somehow arrange our tournament so that the winner will defeat this many people. + + winner: d + second place: a or c + + d + / \ + a d + /\ /\ + / \ / \ + a b c d ceil(log_2(4)) = 2 + Now that we have a list of the `ceil(log_2(n))` people who played the winner, -we just find the best out of them which is -`ceil(log_2(n)) - 1` comparisons. +we just find the best out of them which is `ceil(log_2(n)) - 1` comparisons. -To review, we -have `n - 1` comparisons to get the winner. -`ceil(log_2(n)) - 1` to get the second best, -from the list he played. -So the upper bound for the algorithm is: +To review, we have `n - 1` comparisons to get the winner. +`ceil(log_2(n)) - 1` to get the second best, from the list he played. +So an upper bound on the comparisons for the algorithm is: n + ceil(log_2(n)) - 2 Actually implementing this algorithm effectively will require us to create several components. -We will build these up over the next few lessons. [wimbledon]: https://en.wikipedia.org/wiki/The_Championships,_Wimbledon +[binary-tree]: https://en.wikipedia.org/wiki/Binary_tree -### Unoptimal divider and conquer approach +### What about divide and conquer? It might appear you could use divide and conquer. 
-First split it in two, -find min and second min of the first half, +First split the list of elements in two, +find the min and second min of the first half, and the second half, and then merge them together doing two comparisons. It sounds very elegant because it's all recursive. But, let us think about how many comparisons it's going to do -with simple mathematics. +using simple mathematics. -1. We start with `n` we need to pair and compare them. - So the first round is `n/2` comparisons. +1. We start with `n`. We need to pair and compare them, + so the first round is `n/2` comparisons. 2. In the second round we pair up the results, so we need `n/4` "games". @@ -147,22 +164,21 @@ with simple mathematics. 3. And so on... -So that would be: +So the total number of comparisons would be: - n/2 + (n/2 + n/4 + n/8 + ...) - + n/2 + (n/2 + n/4 + n/8 + ...) = n/2 + n-1 + = 3n/2 - 1 -total comparisons. -That's not what we're trying to accomplish, -so divide and conquer doesn't always do what we think. +That's not what we're trying to accomplish, so divide and conquer doesn't always do what we think. ### Tournament tree shapes -First, we need rearrange the tournament we play. +To get the number of comparisons that we want, we need to rearrange the tournament we play. Right now `min_element` plays a tree structure that looks like this: - Unbalanced Tree + unbalanced tree + /\ /\ /\ @@ -170,11 +186,12 @@ Right now `min_element` plays a tree structure that looks like this: /\ It has `n - 1` internal nodes. -But, we don't want the winner to play `n - 1` matches. -We need to transform that into the way they play Tennis. +But we don't want the winning element to be compared `n - 1` times. +We need to transform that into the way they play tennis tournaments. We need to balance the tree. - Balanced Tree + balanced tree + / \ /\ /\ ... @@ -182,143 +199,178 @@ We need to balance the tree. How do we do it? One way is to just pair up elements and build up. -But then we need lots of memory to save the intermediate results. -What do we mean when we say lots of memory? -`O(n)` is bad, `O(sqrt(n))` is pretty bad. - -Note that once a bottom-level round has been played, -they are ready to move up. -Our goal is to basically to become eager. -Whenever guys are ready to be paired we want to pair them. -So if we only store only the winner at each level, -we never need to store `log(n)` things. +But then we need lots of memory to save the intermediate results[^early-ref-to-inplace]. +Note that once a bottom-level round has been played, they are ready to move up. +Our goal is basically to become eager. +Whenever elements are ready to be paired together, we want to pair and compare them. + +So if we store only the winner at each level, we only need to store `log(n)` things. We can define the **power** of each element to be the number of games they have played. Realize that suddenly we see something which has nothing to do with our problem. -*The foundation of our algorithm is the ability to take a tree like -the linear (unbalanced) tree and transform it into a balanced tree*. +*The foundation of our algorithm is the ability to take a tree, like +the linear (unbalanced) tree, and transform it into a balanced tree*. What mathematical property allows us to do such a transformation? -Specifically why can we convert one kind of computation to the other. -Associativity[^associativity]. +Why can we convert one kind of computation to the other? +**Associativity**[^associativity]. 
As long as our operation is associative, -What property don't we need? Commutativity. -We keep them in the same order, -we just rebalanced parenthesis. +what property don't we need? **Commutativity**[^commutativity]. +We keep the elements in the same order, +we're just rebalancing parentheses[^min-not-commutative]. -If you think about it, -our `min` is not quite commutative. -In mathematics `min` is commutative. -But, because we want to preserve stability it is not. -We distinguish between the left and right argument. +[^early-ref-to-inplace]: Alex: What do we mean when we say lots of memory? + `O(n)` is bad, `O(sqrt(n))` is pretty bad. + See the definition of "in-place memory usage" at the end of the chapter. +[^min-not-commutative]: Alex: If you think about it, + our `min` is not quite commutative. + In mathematics `min` is commutative. + But because we want to preserve stability, it is not. + We distinguish between the left and right argument. -[^associativity]: [Associativity][associative] is a fundamental property studied in abstract - algebra. - Informally, it is that you can apply an operation to arguments - in any order you want. - Formally: +[^associativity]: A binary function `f` is [associative](https://en.wikipedia.org/wiki/Associative_property) + if the following holds for all `a, b, c` in its domain: f(f(a, b), c)) = f(a, f(b, c)) - For example, multiplication: + Informally, `f` can be applied in any order. + For example, addition of integers is associative: + + (a + b) + c = a + (b + c) + + Subtraction of integers is not associative: - (a * b) * c = a * (b * c) + (3 - 2) - 1 != 3 - (2 - 1) You can see why associativity is often referred to as being - able to "re-paranthesize". + able to "re-parenthesize" expressions. + +[^commutativity]: A binary function `f` is [commutative](https://en.wikipedia.org/wiki/Commutative_property) + if for all `a, b` in its domain: + + f(a, b) = f(b, a) + + Informally, `f` gets the same result, regardless of the order of the inputs. + For example, multiplication of integers is commutative: + + a * b = b * a + + In Chapter 9.1 of "From Mathematics to Generic Programming", Alex gives + a neat visual proof of this fact for integer multiplication: + + * * * + * * * * * * * * + * * * * * = * * * + * * * * * * * * + * * * + + Or as [Dirichlet][dirichlet] put it: "Whether you arrange soldiers in rows or columns, you still have the same number of soldiers". + + An example of an operation which is not commutative is string concatenation. -[associative]: https://mathworld.wolfram.com/Associative.html + "Hello, " + "World!" != "World!" + "Hello, " +[dirichlet]: https://en.wikipedia.org/wiki/Peter_Gustav_Lejeune_Dirichlet ## Binary counting and reduction Here we come to the amazing idea of how to do this transformation. -This is one of the most beautiful ideas which they -kept secret from you. +This is one of the most beautiful ideas which they kept secret from you. They should have taught it in high school. -But, they want to publish papers -themselves and not tell you the general mechanism. +But, they want to publish papers themselves and not tell you the general mechanism. -We can create a counter, and in every bit of this counter we're -going to keep a singleton. -In each bit we keep the person who had `n` victories. -We will never combine things unless they have the same weight/parity. +Let us assume we have elements of type `T` that need to be paired or combined in some way, +whether with `min`, `+`, `merge`, or any other associative operation on `T`. 
+What we can do is create an array called a "counter". -Initially the counter has `zero` in every entry: + index: 0 1 ... 31 + contents: x1 x2 ... x32 - 1 2 ... 32 (bits/singletons) - 0 0 ... 0 +The `nth` slot of the counter will store the element that has had `n` "victories" so far. +So if there is a guy in slot 0 he hasn't played any games yet. +If there is a guy in slot 2 he has won 2 games, and so on. +This structure will help us to only pair up elements that have the same power. + +The following example using `min` as the operation should make this clear. +Initially the counter has zero in every entry: + + initial counter + + index: 0 1 ... 31 + contents: 0 0 ... 0 Take a new guy `x` who has never played any games, -and you look at the guy in the first slot of the counter. +and look at the guy in the first slot of the counter. The existing guy is either zero or not. -If it's zero (`_`), put him in the counter. +If it's zero, put the new guy `x` in the counter at index `0` (he has not played any games). - 1 2 ... 32 1 2 ... 32 - 0 0 ... 0 >>> x 0 ... 0 + 0 1 ... 31 0 1 ... 31 + 0 0 ... 0 --> x 0 ... 0 +Now take another guy `y`. Since `x` is in the first slot of the counter, we must pair them up. +The winner moves on up to the next slot in the counter, +as they have now won a game. +So if `y` wins: -If it's not zero, he plays a game with the existing guy. -If he wins, he replaces the loser in the counter. + 0 1 ... 31 0 1 ... 31 + x 0 ... 0 --> 0 y ... 0 - 1 2 ... 32 1 2 ... 32 - y 0 ... 0 >>> x 0 ... 0 +Otherwise `x` wins: + 0 1 ... 31 0 1 ... 31 + x 0 ... 0 --> 0 x ... 0 -Otherwise, the existing guy has now won a game. -So he needs to be promoted to the next level, -he follows the same rules with the guy in that slot. -It's a carry propagation. +What if the index `1` slot was non-zero, after comparing `x` and `y`? +Then the guy there already won one game. +So, we must **carry propagate**[^adder-circuit]. +Repeat the same process all the way up the counter, until we find a slot which is zero. +What if the counter is full, and has no zero slots? +That's called an **overflow**. - 1 2 ... 32 1 2 ... 32 - y 0 ... 0 >>> 0 y ... 0 +We borrow terminology from binary integer arithmetic because +our counter works just like a binary integer counting up: -If we end up with a guy in slot 32, it's an overflow, -exactly like integer arithmetic. -What do we do? -Whenever we don't know to proceed, -do something sensible and let whomever uses it figure out what is a sensible thing -to do. -Return the carry[^carry]. + 0 0 0 + 1 0 0 + 0 1 0 + 1 1 0 + 0 0 1 + 1 0 1 + 0 1 1 + 1 1 1 +But instead of 0 and 1 in each slot or "bit" we have arbitrary elements that are combined with an associative operation. + +### Handling overflow + +What do we do if the counter overflows? +Whenever we don't know how to proceed, +do something sensible and let whomever uses it figure out what is a sensible thing to do. +Return the carry. +If the return is non-zero the programmer who called the counter will know +it overflowed and can decide what to do. +Maybe they will extend the counter or throw an error. +It's his business not ours. Let us be lazy. -The great success in -programming comes because there are lazy people who say, "I don't want to know now, +The great success in programming comes because there are lazy people who say, +"I don't want to know now, I'll find out later." Right now we are solving this problem. 
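+To make the carry mechanics concrete, here is a small trace of my own:
+five elements `x1, ..., x5`, a three-slot counter, and `min` as the operation
+(`w12` stands for the winner of `x1` and `x2`, and so on):
+
+    add x1:  [x1,  0,   0  ]
+    add x2:  x2 plays x1; the winner carries up one slot
+             [0,   w12, 0  ]
+    add x3:  [x3,  w12, 0  ]
+    add x4:  x4 plays x3; their winner then plays w12,
+             which stays on the left since it arrived earlier
+             [0,   0,   w14]
+    add x5:  [x5,  0,   w14]
+
+After all five are added, `x5` and `w14` are still sitting in the counter.
+Combining such leftovers is exactly the reduction pass described below.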
We have an associative binary operation of some kind - and what we discovered that if we have associativity, +on type `T` and what we discovered is if we have associativity, we can make this counter and it will work for us. -If you are familiar with [numerical analysis][numerics], -whenever you sum up large number you don't really want to -add small quantities to big quantitative. -Bad things happen to the errors[^errors]. -So, you could use the same device for balancing your addition. -If you want to implement [merge sort][merge-sort] you can use exactly the same device, since -merge is associative[^ryan]. -The idea with merge sort, is only want to merge lists if they are roughly the same length -and this helps you do it. - -When we become grownups we learn about advanced data structures, -such as [binomial forest][binomial]. -They use the same idea. -The counter helps us combine things only when they have the same weight. -nless they are of the same weight it's a general algorithmic technique - - -[^carry]: The terms **carry** and **carry propagation** - are usually associated with the algorithm of binary addition - and especially its implementation as - an electronic/logical circuit called an [adder][adder]. - - An adder has two inputs `a` and `b` for +[^adder-circuit]: The terms **carry** and **overflow** + are closely associated with the implementation of binary counting + or addition as an electrical circuit, called an [adder][adder]. + + A single bit adder has two inputs `a` and `b` for the two digits to add together. It outputs a digit `s` which is the digit to display for this place value. - There is supplementary output called a carry + There is a supplementary output called the carry, which is the amount that needs to move to the higher place value. @@ -331,7 +383,7 @@ nless they are of the same weight it's a general algorithmic technique 1 1 | 1 0 This adds a single digit, but we want to add entire numbers. - To, do so we add an additional input `l` for carried digits + To do so we add an additional input `l` for carried digits from lower place values. a b l c s @@ -343,7 +395,7 @@ nless they are of the same weight it's a general algorithmic technique 0 1 1 | 1 0 1 1 1 | 1 1 - Now we can chain `n` of them togeter to be able to add `n` digits + Now we can chain `n` of them together to be able to add `n` digits s_1 c_1-----| s_2 c_2---- ... | | | | | @@ -353,32 +405,31 @@ nless they are of the same weight it's a general algorithmic technique An **overflow** is when the last adder has a non-zero carry. - -[^ryan]: When I first learned about this algorithm, I showed a friend - and he immediately had the idea to use it - for merge sort, without any hint this was possible! +[^floating-point-error]: Alex is referring to the following issue: + When floating point numbers become very large they can no longer represent small nearby increments. + So if many small numbers are accumulated in order, the sum may grow large enough to ignore the contribution of any particular element. + To see for yourself, observe that: + + double x = 1.45 * pow(2, 60); + assert(x == x + 1.0); -[^errors]: Numerical floating point calculation - is a subtle subject, but the basic issue Alex is - referring to is straightforward. - Floating point stores large numbers as decimals to some power. - If you add a small decimal to that, it may not change it all. 
-    To try for yourself, compare the output of:
+    This is because floats are represented as a base and exponent, and the magnitude of the exponent constrains the precision.

-        (152500.0 * 5000.0)
+    If we forget about computers, and think about science or engineering, it's natural to consider:

-    and
+        1.45 * 2^60 + 1 ~= 1.45 * 2^60

-        (152500.0 * 5000.0) + 0.01
+    because the `1` is not very significant to the overall magnitude.
+
 [adder]: https://en.wikipedia.org/wiki/Adder_(electronics)
-[numerics]: https://en.wikipedia.org/wiki/Numerical_analysis
 [merge-sort]: https://en.wikipedia.org/wiki/Merge_sort
 [binomial]: https://en.wikipedia.org/wiki/Binomial_heap

### Implementation

+Now we have to write the code.
The first function will add an element to the counter
using the process we just described.

@@ -405,12 +456,18 @@
it is evaluated in.
Elements in the counter got there before,
so they were to the left of the element we are inserting.

-Notice that zero is `const` reference because we don't plan to modify it,
+Notice that zero is a `const T&` reference because we don't plan to modify it,
but we do modify carry, so it should be passed by value.

-The second function applies the operation
-to all the elements left sitting in the counter.
+### Reduction
+
+After we finish adding all our elements to the counter, they might not all be reduced to one element.
+There may be several elements left sitting at various levels of the counter.
+We need to do one more pass of the operation to combine them into the final result.
+
+This second function does that. It applies the operation,
+in the same manner, to the elements left sitting in the counter.

    template <typename T, typename I, typename Op>
    // requires Op is BinaryOperation(T)

@@ -436,11 +493,7 @@
Sometimes it will work with the operation
so apply `op(x, zero)` gives you `x`,
but sometimes that won't happen.
So we can't really initialize to zero.

-**Exercise:** Use these functions to sum up an array of `double`s.
-
-**Exercise:** Rewrite `min_element` using these functions (just `min_element`, don't worry about second best).
-
-## Binary counter object
+## Binary counter class

### Start with algorithms

We want to take these two algorithms and combine them into an object.
We have two beautiful algorithms but they're stateless.
Everything is outside,
and this is what we should always do,
-no matter what we has been taught in software engineering classes by very wise
+no matter what we have been taught in software engineering classes by very wise
object-oriented professors.
You start with algorithms.
You don't start with objects.

@@ -456,23 +509,23 @@
Figure out what you're going to do first.
But you don't have to stop there.
Because you can then put things together into an object.

-It's very easy when you write an algorithm to
-have a minimal iterator interface.
-They externalize the counter.
+It's very easy when you write an algorithm to have a minimal iterator interface.
+In this case, the iterators externalize the counter.
We say, "we don't want to know about him, we're just algorithms people".
-We assume the principle that we will have no state for about
+We assume the principle that we will have no state for about
five minutes, and stay very functional,
but then turn around and deal with state.
We look at the whole thing.

+
### Counter storage

So what do we think is the state of this counter?
How should we store the counter?
Don't we have millions of elements to reduce?
The counter is size `log(n)` which will
-never be greater than 64.
+never be greater than `64`.
So, it's actually a small fixed size.
We will store it in a `std::vector`.

@@ -501,20 +554,32 @@
        }
    };

-Counter is private, we don't
+`counter` is private, we don't
want people to muck up our counter.
-Same with our operation
+Same with our operation.
Using initializer lists instead of assignment in constructor is important.
If you initialize in the body,
it will first call default constructors for members,
then you overwrite all the work with an assignment.

I think it is very beautiful.
-We could compete with Steve Jobs for elegance of our design[^alex-joke].
+We could compete with Steve Jobs for elegance of our design[^alex-apple-joke].
+
+**Exercise:** In numerical analysis, whenever you sum up large numbers you don't really want to add small quantities to big quantities.
+    Bad things happen to the errors[^floating-point-error].
+    Use this code to write a function which sums arrays of `double`.
+
+**Exercise:** Rewrite `min_element` using this code (just `min_element`, don't worry about second best).
+
+**Exercise:** If you want to implement [merge sort][merge-sort] you can use exactly the same device, since merge is associative. The idea with merge sort is you only want to merge lists if they are roughly the same length and this helps you do it (see Chapter 12).
+    Write the associative binary operation `merge` which can combine two sorted arrays into a sorted array.

-[^alex-joke]: Alex: Maybe we should make it in China.
+**Exercise:** When we become grownups we learn about advanced data structures, such as [binomial forest][binomial]. They use the same idea. Learn about this data structure and try to figure out where
+    the counter could be used.
+
+[^alex-apple-joke]: Alex: Maybe we should make it in China.
    That's a necessary prerequisite for beautiful design.
-    [Designed in Cupertino][designed-by-apple] assembled in China.
+    [Designed in Cupertino][designed-by-apple], assembled in China.
    So let's try to assemble our machine in Palo Alto.

[designed-by-apple]: https://signalvnoise.com/posts/2710-designed-by-apple-in-california

@@ -522,11 +587,11 @@

### What is in-place memory usage?

How significant is the storage of our counter?
-We use the term ["in-place"][in-place] to indicate the memory
+We use the term [**in-place**][in-place] to indicate the memory
usage of an algorithm is not significant.
A long time ago people thought
-algorithms were in-place if they didn't require any extra memory. 
-Then they decided constant memory was enough. 
+algorithms were in-place if they didn't require any extra memory.
+Then they decided constant memory was enough.
But, then they said it doesn't quite work because our favorite algorithm doesn't work.
Of course the most important algorithm is [quicksort][quicksort].
It's not in-place because it's recursive.

@@ -534,10 +599,9 @@
Recursive algorithms tend to require `O(log(n))`.
Quicksort splits roughly in the middle let's assume on average.
So people say `log(n)` is good so that will count as in-place.

-Then they said , "what if we have nested things.
-So is `log(n)^2` ok"?
+Then they said, "what if we have nested things?"
+So is `log(n)^2` ok?
In our universe `log(n) <= 64` so this is 4096.
-
Then theoreticians said it's really alright if we have "poly-logarithmic" storage.
Basically as long as the memory requirement is `O(p(log(n)))`
where `p` is a polynomial, it's alright.
@@ -547,8 +611,12 @@ but let's go with "poly-logarithmic" being "in-place". [quicksort]: https://en.wikipedia.org/wiki/Quicksort [in-place]: https://en.wikipedia.org/wiki/In-place_algorithm + ## Code - [binary_counter.h](code/binary_counter.h) +- [test_binary_counter.cpp](code/test_binary_counter.cpp) + + diff --git a/11_min_1_2.html b/11_min_1_2.html index 593cc9a..91e29dc 100644 --- a/11_min_1_2.html +++ b/11_min_1_2.html @@ -2,116 +2,9 @@ + 11. Smallest and second-smallest element - + @@ -130,8 +23,8 @@

    11. Smallest and second-smallest element

    - -

    Program design approach

    + +

    Write code backward

    You all learned that the first thing you do when programming is define abstract things, then do specific things. @@ -151,32 +44,55 @@

    Program design approach

    So you can design those after. All the best programmers are lazy. If they were not lazy, they would do work with their hands. -They invented programming languages to be lazy. -Imitate them.

    +They invented programming languages to be lazy. Imitate them.

    + + +

    Overview

    + +

    We will call the function which finds the smallest and second smallest element min_element1_2. +Note that I picked this algorithm not because it is of paramount importance +for your future work. +I pick this algorithm because it allows us to learn how to do decomposition +and learn these components like list_pool and binary_counter along the way.

    + +

    Let me sketch the grand plan of the whole algorithm.

    + +
      +
    1. We already showed that we want to arrange our comparisons like a tennis tournament, +and binary_counter helps us do this. +Instead of comparing by left reduction, we compare by balanced reduction.

    2. +
    3. We also want to keep a history for each element of all +the other elements which they have beat. +This history will be used to determine the second-place guy.

      + +

      We will store this history in a list (using list_pool) +along with each element in the binary counter. +Note that the counter works on generic elements, so it doesn’t +need to be modified to know about this history tracking.

    4. +
    + + +

    From where we are now, it should only take 4-5 lines of code +to write min_element_1_2 along with type scaffolding.

    Combining binary counter and list pool

    -

    We built a machine for doing reduction. -We also built a pool for very fast lists. -Now we have to combine all that so it produces the final algorithm. -From where we are now, it should only take 4-5 lines of code -to write min_element_1_2 -along with type scaffolding.

    -

    Inner loop

    -

    Let us imagine you have all the materials to build it (we don’t) -and let’s discuss the main loop:

    +

    To start, let us imagine you have all the materials to build it (we don’t) +and discuss the main loop:

      -
    1. We will do a while loop and add things to a counter.
    2. -
    3. We will reduce the counter. -The result will have the minimum in the list.

    4. -
    5. Use std::min_element to find the second place element - in the list of losers.

    6. -
    7. Take the result of 2 and 3 and combine them in a pair.

    8. +
    9. Do a while loop over a range of elements and add them to a binary_counter.

      + +

      Actually we will store iterators pointing to the elements, rather than the elements themselves +so we can return all the useful information.

    10. +
    11. Reduce the counter. The result will be the minimum element.

    12. +
    13. The winner will also have a list of other elements it was compared with. +Use std::min_element to find the second place element in the list of losers.

    14. +
    15. Take the result of 2 and 3 and combine them in a pair. Return it.
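Sketched in code, the four steps might look something like this.
This is only my own sketch: the names add and reduce, and the iterator construction,
are assumptions about the components we are about to build, not their final interfaces.

    while (first != last) {
        // step 1: each element enters paired with an empty list of losers
        counter.add(std::make_pair(first, pool.empty_queue()));
        ++first;
    }
    result_type winner = counter.reduce();   // step 2: the overall minimum
    // step 3: the best among the winner's losers
    iterator second = std::min_element(iterator(pool, winner.second), iterator(pool), cmp);
    return std::make_pair(winner.first, *second);   // step 4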

    @@ -195,20 +111,20 @@

    Inner loop

    these are the only instruction generating lines2:

    Before the loop we need to define these objects and types. -Let’s construct our counter. Do we know it’s type? No. -That’s ok call it counter_type.

    +Let’s construct our counter. Do we know its type? No. +That’s ok, call it counter_type.

    counter_type counter(op, std::make_pair(last, pool.empty_queue()));
     

    We need a counter operation. Do we know its type? No. -Do the lazy thing call it op_type.

    +Do the lazy thing, call it op_type.

    op_type op(cmp, pool);
     
    -

    Now define the pool. We do know it’s type:

    +

    Now define the pool. We do know its type:

    list_pool<I, std::size_t> pool;
     
    @@ -216,7 +132,7 @@

    Inner loop

    Notice that we use std::min_element on our list pool. Will that work? Yes, because we added iterators to our list pool. -Define our iterator type;

    +Define our iterator type:

    typedef typename list_pool<I, std::size_t>::iterator iterator;
     
    @@ -261,7 +177,7 @@

    Reduction operation

    We will define a reduction function object to be used in the binary counter to find the min. -What it will do is apply a comparison operation between two elements cmp(a, b), +What it will do is apply a comparison operation between two elements cmp(a, b). When an element wins a comparison, the loser will be added to a list of elements which have lost to a. In other words, it will keep track of the elements which each element has beaten. @@ -292,7 +208,7 @@

    Reduction operation

    };
    -

    When an element wins, we can combine it’s list of losers +

    When an element wins, we can combine its list of losers with the element it beat, due to transitivity. We want this operation to be stable, so we need to be careful with the order in which the losers are stored. @@ -314,7 +230,7 @@

    Finishing the scaffolding

    typedef op_min1_2<I, size_t, compare_type> op_type; typedef binary_counter<op_type> counter_type; - typedef typename op_type::argument_type result_type; + typedef typename op_type::argument_type result_type; // ... } @@ -327,8 +243,7 @@

    Finishing the scaffolding

    This sense of wonder does not disappear.

    Exercise: Implement this algorithm in another language. - It will help you see language limitations and - understand the algorithm better.

    + It will help you see language limitations and understand the algorithm better.

    Why do we need the typename keyword?

    @@ -345,33 +260,39 @@

    Why do we need the typename keyword?

    };
    -

    Obviously what you are trying to do is extract T’s value type +

    Obviously what you are trying to do is extract T’s value_type and use it here. -Let us try to to follow the committees logic. +Let us try to follow the committees logic. The logic says maybe T::value_type refers to a static variable in T, which it could be of course. But, don’t you know from the context that it’s supposed to be a type? Since they are very well educated, they say, “but that will make our grammar context-sensitive4. We need to figure out the meaning of -a token without referring to the context in which it appears”.

    +a token without referring to the context in which it appears”.

    So they came up with the following rule: If you don’t put typename, the compiler must assume it is a variable, even if it is a type. This is done to maintain the property that you don’t need to know outside context. -Of course, the problem here does not really relate to typename. -The problem exists because T is not specified. +Of course, the problem here does not really relate to typename. +The problem exists because T is not specified. The language has no concepts. For example if we said Container T instead of class T, and had a concept Container, the definition of Container would say that it is required to have an affiliated type value_type. -Then the compiler could figure out what we really mean. -What often happens is that instead of +Then the compiler could figure out what we really mean.

    + +

    What often happens is that instead of solving the real problem, a partial problem is solved. We still do not have concepts. One of the great things about C++ is the language has been evolving for 40 years which is also one of the terrible -problems.

    +problems. +All its features have been added over time. +So, it works with all kinds of quirks.

    + +

    The advice Bjarne gives right now, is use typename whenever you can, +even in the context when it’s not absolutely required.

    Code

    @@ -391,9 +312,9 @@

    Code

    If you are confused refer to the final code at the end of the lesson.
  • -These are the only lines which generate +These are the only lines of code which generate assembly instructions for the CPU to execute. -All other lines are just to make the C++ type system work.
  • +All other lines of code are just to make the C++ type system work.
  • Much of the scaffolding can be removed in modern C++. Most of the ugly typename ... definitions diff --git a/11_min_1_2.md b/11_min_1_2.md index 338cdd6..c0040ff 100644 --- a/11_min_1_2.md +++ b/11_min_1_2.md @@ -1,7 +1,7 @@ 11. Smallest and second-smallest element ======================================== -## Program design approach +## Write code backward You all learned that the first thing you do when programming is define abstract things, then do specific things. @@ -21,31 +21,52 @@ For most algorithms you also need objects. So you can design those after. All the best programmers are lazy. If they were not lazy, they would do work with their hands. -They invented programming languages to be lazy. -Imitate them. +They invented programming languages to be lazy. Imitate them. -## Combining binary counter and list pool +## Overview + +We will call the function which finds the smallest and second smallest element `min_element1_2`. +Note that I picked this algorithm not because it is of paramount importance +for your future work. +I pick this algorithm because it allows us to learn how to do decomposition +and learn these components like `list_pool` and `binary_counter` along the way. + +Let me sketch the grand plan of the whole algorithm. + +1. We already showed that we want to arrange our comparisons like a tennis tournament, +and `binary_counter` helps us do this. +Instead of comparing by left reduction, we compare by balanced reduction. + +2. We also want to keep a history for each element of all + the other elements which they have beat. + This history will be used to determine the second-place guy. + + We will store this history in a list (using `list_pool`) + along with each element in the binary counter. + Note that the counter works on generic elements, so it doesn't + need to be modified to know about this history tracking. -We built a machine for doing reduction. -We also built a pool for very fast lists. -Now we have to combine all that so it produces the final algorithm. From where we are now, it should only take 4-5 lines of code -to write `min_element_1_2` -along with type scaffolding. +to write `min_element_1_2` along with type scaffolding. + +## Combining binary counter and list pool ### Inner loop -Let us imagine you have all the materials to build it (we don't) -and let's discuss the main loop: +To start, let us imagine you have all the materials to build it (we don't) +and discuss the main loop: + +1. Do a `while` loop over a range of elements and add them to a `binary_counter`. + + Actually we will store iterators pointing to the elements, rather than the elements themselves + so we can return all the useful information. -1. We will do a `while` loop and add things to a counter. -2. We will reduce the counter. - The result will have the minimum in the list. +2. Reduce the counter. The result will be the minimum element. -3. Use `std::min_element` to find the second place element - in the list of losers. +3. The winner will also have a list of other elements it was compared with. + Use `std::min_element` to find the second place element in the list of losers. -4. Take the result of 2 and 3 and combine them in a pair. +4. Take the result of 2 and 3 and combine them in a pair. Return it. Now let's start writing it, even though we don't have all the parts. @@ -61,25 +82,25 @@ We will have to adjust it, but these are the only instruction generating lines[^instruction-generating]: Before the loop we need to define these objects and types. -Let's construct our counter. 
Do we know it's type? No. -That's ok call it `counter_type`. +Let's construct our counter. Do we know its type? No. +That's ok, call it `counter_type`. counter_type counter(op, std::make_pair(last, pool.empty_queue())); We need a counter operation. Do we know its type? No. -Do the lazy thing call it `op_type`. +Do the lazy thing, call it `op_type`. op_type op(cmp, pool); -Now define the pool. We do know it's type: +Now define the pool. We do know its type: list_pool pool; Notice that we use `std::min_element` on our list pool. Will that work? Yes, because we added iterators to our list pool. -Define our `iterator` type; +Define our `iterator` type: typedef typename list_pool::iterator iterator; @@ -93,9 +114,9 @@ to write, but we are sort of done. If you are confused refer to the final code at the end of the lesson. -[^instruction-generating]: These are the only lines which generate +[^instruction-generating]: These are the only lines of code which generate assembly instructions for the CPU to execute. - All other lines are just to make the C++ type system work. + All other lines of code are just to make the C++ type system work. ### Comparing iterator values @@ -129,7 +150,7 @@ lines: We will define a reduction function object to be used in the binary counter to find the `min`. -What it will do is apply a comparison operation between two elements `cmp(a, b)`, +What it will do is apply a comparison operation between two elements `cmp(a, b)`. When an element wins a comparison, the loser will be added to a list of elements which have lost to `a`. In other words, it will keep track of the elements which each element has beaten. @@ -159,7 +180,7 @@ This list of "losers" associated with each element is stored in a `list_pool`. } }; -When an element wins, we can combine it's list of losers +When an element wins, we can combine its list of losers with the element it beat, due to transitivity. We want this operation to be stable, so we need to be careful with the order in which the losers are stored. @@ -180,7 +201,7 @@ we can define the final missing types and the signature: typedef op_min1_2 op_type; typedef binary_counter counter_type; - typedef typename op_type::argument_type result_type; + typedef typename op_type::argument_type result_type; // ... } @@ -192,9 +213,7 @@ There is quite a lot of complexity going on. This sense of wonder does not disappear. **Exercise:** Implement this algorithm in another language. - It will help you see language limitations and - understand the algorithm better. - + It will help you see language limitations and understand the algorithm better. [^auto]: Much of the scaffolding can be removed in modern C++. @@ -217,33 +236,39 @@ brought up this example: typedef T::value_type value_type; }; -Obviously what you are trying to do is extract `T`'s value type +Obviously what you are trying to do is extract `T`'s `value_type` and use it here. -Let us try to to follow the committees logic. +Let us try to follow the committees logic. The logic says maybe `T::value_type` refers to a static variable in `T`, which it could be of course. But, don't you know from the context that it's supposed to be a type? Since they are very well educated, they say, "but that will make our grammar [context-sensitive][context-free][^languages]. We need to figure out the meaning of -a token without referring to the context in which it appears". +a token without referring to the context in which it appears". 
So they came up with the following rule: If you don't put `typename`, the compiler must assume it is a variable, even if it is a type. This is done to maintain the property that you don't need to know outside context. -Of course, the problem here does not really relate to typename. -The problem exists because `T` is not specified. +Of course, the problem here does not really relate to `typename`. +The problem exists because `T` is not specified. The language has no concepts. For example if we said `Container T` instead of `class T`, and had a concept `Container`, the definition of `Container` would say that it is required to have an affiliated type `value_type`. Then the compiler could figure out what we really mean. + What often happens is that instead of solving the real problem, a partial problem is solved. We still do not have concepts. One of the great things about C++ is the language has been evolving for 40 years which is also one of the terrible problems. +All its features have been added over time. +So, it works with all kinds of quirks. + +The advice Bjarne gives right now, is use `typename` whenever you can, +even in the context when it's not absolutely required. [^languages]: This terminology is specific to compilers and theory of computation. It refers to a classification diff --git a/12_merge_sort.html b/12_merge_sort.html index c7c0714..20ed142 100644 --- a/12_merge_sort.html +++ b/12_merge_sort.html @@ -2,116 +2,9 @@ + 12. Merge Sort - + @@ -148,22 +41,22 @@

    The discovery of generic programming

    A friend of mine who worked at this Institute recommended me to this group. I knew I was going to go -there for for an interview and I’d have to say some wonderful things or they +there for an interview and I’d have to say some wonderful things or they would not give me a job. -Right before my interview, terrible thing happened. +Right before my interview, terrible thing happened. I ate some raw fish and it was very tasty but within - eight hours my temperature was 103. +eight hours my temperature was 103. I was in a bad situation.

I’m in the hospital flying above my bed.
That happens to you when you have high fever.
-But, what I’m thinking about is this upcoming interview about parallel computers. 
+But, what I’m thinking about is this upcoming interview about parallel computers.
I know nothing about parallel computers, not now, nor ever,
but I really want the job.
So I need to come up with some brilliant idea.
So I’m floating above the bed in space and thinking.
-Then I suddenly realize how to add four numbers in parallel I said oh I could add them
-in parallel if I add the first two and the second two in parallel:
+Then I suddenly realize how to add four numbers in parallel. I said, “oh I could add them
+in parallel if I add the first two and the second two in parallel”:

    +Then I suddenly realize how to add four numbers in parallel I said, “oh I could add them +in parallel if I add the first two and the second two in parallel”:

         +
         /  \
    @@ -178,7 +71,7 @@ 

    The discovery of generic programming

    Then I realized the second great thing, while still floating. It could be multiplication and the same thing will work. I started realizing more and more -functions will work min, max, but division will not work. +functions will work min, max, but division will not work. Then I realized it’s good to be sick. It has to do something which I almost forgot called abstract algebra. How is this related? @@ -187,12 +80,12 @@

    The discovery of generic programming

    I realized that this idea will work as long as the operation is associative.

    This became my central theme. -how could I talk about associative operations? -how could I write algorithms like that? +How could I talk about associative operations? +How could I write algorithms like that? When I came to United States somewhere along the way, maybe in Austria, I realized that merge was associative It was a very big deal because I never even thought about merge in terms of + and *, but merge is associative. -Then I realized I could do merge sort with parallel reduction. +Then I realized I could do merge sort with parallel reduction. That’s the summary. Alex, what have you been doing all your life? This. You might say, “that’s not good enough”. @@ -219,14 +112,14 @@

    Iterators as a concept

    1. Doing programming in mathematical theories
    2. -
    3. not losing efficiency
    4. +
    5. Not losing efficiency

    We are slowly hinting more at concepts. How is a concept different from a class? How is a concept different than an abstract class -or interface in java? +or interface in Java? One way to see this is try to think of how to specify the concept of an iterator using them. Consider that iterators have a value_type. @@ -245,7 +138,7 @@

    Iterators as a concept

    In Java you can kind of fix it by doing lots of casting. But, the other problem is more serious as you can’t fix the fundamental variance on value_type.

    -

    People say, “but couldn’t there be a correct inheritance? +

    People say, “but couldn’t there be a correct inheritance?” Of course, yes. But it’s not the inheritance we have in C++ or Java.

    @@ -268,7 +161,7 @@

    Kinds of iterators

    -

    Linked iterator

    +

    Linked iterator

    There are some people who say, “concepts aren’t important because Alex already exhausted them. there are just four kinds of iterators and that’s all.” @@ -288,7 +181,7 @@

    Linked iterator

    -

    It has one additional operation which allows +

    it has one additional operation which allows you to set the successor. If you have an iterator and you have another one, you can just make the second iterator the successor @@ -306,7 +199,7 @@

    Linked iterator

    We want it to be a function not a member because primitive/built-in -types don’t have member functions. +types don’t have member functions5. Furthermore, in the case of an iterator, it might be a pointer, not a class. In general, I don’t like member functions. @@ -321,7 +214,7 @@

    Linked iterator

    There is at least one provable advantage. -It has one fewer character5. +It has one fewer character6. But, since we think generically, a better question is whether the function could operate on a built-in type instead of a class.

    @@ -329,7 +222,7 @@

    Linked iterator

    Linked iterator is “unsafe”

    -

    As long as you don’t do it, the topology remains. +

    As long as you don’t do set_successor, the topology remains. But, if you do, things change. This is a very unsafe operation. You can even make circular lists. @@ -345,8 +238,8 @@

    Linked iterator is “unsafe”

    you will not be successful. But, it is a legitimate data structure.

    -

    A long time ago when we programming in Lisp, -there were people saying, “never use rplcd” (set successor). +

    A long time ago when we were programming in Lisp, +there were people saying, “never use set successor (rplcd)”. They were wrong. Use whatever is given to you. But, be wise. @@ -359,10 +252,10 @@

    Linked iterator is “unsafe”

    Well maybe just while (true) is bad.

    a = 1;
    -while (a > 0)  { do stuff }
    +while (a > 0) { do stuff }
     
    -

    There many ways of writing bad code +

    There are many ways of writing bad code and syntax can’t help you. No smart pointer or syntactic constraint will make a bad programmer into a good programmer. @@ -371,12 +264,12 @@

    Linked iterator is “unsafe”

    they find amazing things.

    -

    Reverse linked ranges

    +

    Reverse linked ranges

    To understand how this concept works we will look at a basic algorithm. If I hadn’t already shown you set_successor -we can learn it’s use from this algorithm. +we can learn its use from this algorithm. It takes two lists, reverses the first, and attaches it to the second. It’s a very important list algorithm. @@ -437,7 +330,7 @@

    Simple merge

    On every single iteration, because only the one iterator we moved in the loop could become empty.

    -

    Exercise: Write a theoretically more efficient merge which does not do this extra comparison (solved just below).

    +

    Exercise: Write an alternative merge which does not do this extra comparison every loop (solved just below).

    Merge with fewer comparisons

    @@ -448,7 +341,7 @@

    Merge with fewer comparisons

    Now I’m going to teach you to use goto. The greatest authority in computer science wrote a famous letter to communications of ACM called, “Go To Statement Considered Harmful”. -There is is nothing harmful in the computer. +There is nothing harmful in the computer. The computer is good. It has an instruction called JMP (or branch). If it’s there, why not use it?

    @@ -530,26 +423,26 @@

    Merge with fewer comparisons

    the successor of the final node. But, if we return the end, it’s probably nil so I don’t have it. -Just return you all the information. +You just return all the information. The caller can do whatever they please. If they want to ignore it, ignore it.

    Note that it assumes the lists are nonempty, which is perfectly fine for our sort, -which is not going to merge empty lists6.

    +which is not going to merge empty lists7.

    Can you write it without goto? -Not as efficiently7.

    +Not as efficiently8.

    Is it worth it?

    We now have two programs. One of them is oh so very simple. -The other one is elegant, but long and does a minimal number of operations. +The other one is elegant, but long and does a minimal number of operations. Which one should we use in practice? We need to do a lot of experiments to establish certainty -of what we are doing8.

    +of what we are doing9.

    Exercise: Measure and determine whether it’s actually faster than our simple merge.

    @@ -560,7 +453,7 @@

    Is it worth it?

    way a condition is going to go. If you look at our branches, the probability of going one way or the other is 50 percent. -That’s literally the worst thing which which could happen. +That’s literally the worst thing which could happen. I’m entering the territory where I haven’t done work yet. I haven’t tried see whether some kind of predication avoidance of goto could be done.

    @@ -578,7 +471,7 @@

    Merge sort

    It has 5 arguments, and we need a binary function.

    template <typename I, typename Compare>
    -// I is Linked Iteratork
    +// I is Linked Iterator
     struct mergesort_linked_operation
     {
       typedef I argument_type;
    @@ -609,18 +502,17 @@ 

    Merge sort

    }
    -

    It is very efficient, and doesn’t do any kind of list splitting9. +

    It is very efficient, and doesn’t do any kind of list splitting10. If you don’t find it’s beautiful I have nothing to teach you. We aren’t on the same wavelength. -In the modern world, you meet somebody -somebody and +In the modern world, you meet somebody and say, “Bach is a great composer”. He says, “nah, Lady Gaga is much more gifted”. It’s a free world. People are allowed to say whatever they like. -Some people will say, “oh this is not object-oriented”, or “this is not functional”, +Some people will say, “oh this is not object-oriented”, or “this is not functional”, which it isn’t and I’m very proud of it. -But, in some sense this is literally the essence of my life work, this piece of +But, in some sense this is literally the essence of my life’s work, this piece of code. That’s where it started. That’s where it ends. @@ -628,6 +520,13 @@

    Merge sort

    The majority of computer scientists do not get it. There is absolutely no indication that getting it will make you rich.

    +

    Exercise: Implement a visualizer (such as console output) which shows + the contents of the counter at each step of the merge algorithm.

    + +

    Exercise: Implement merge sort with std::accumulate (left fold) instead of binary counter. + This is a very inefficient merge sort, but can be helpful to understand how the binary + counter works.

    +

    Code

    @@ -653,38 +552,32 @@

    Code

    This is a subject for which the SGI Documentation can be a better resource.
  • -I suspect he is referring to Andrei Alexandrescu who gave a talk +Alex is likely referring to Andrei Alexandrescu who gave a talk “Iterators must Go”.
  • -

    This argument doesn’t hold up if there are arguments:

    - -
    x.foo(y)
    -
    - -

    Versus:

    +Other programming languages, such as Swift, allow you to extend primitive types like int with additional member functions.
  • +
  • +

    This argument doesn’t hold if the function requires more than one argument. +These two forms both require 8 characters to type:

    -
    foo(x, y)
    +
    x.foo(y)    foo(x, y)
     
    -

    I suppose you could drop the space:

    +

    Perhaps the space could be dropped:

    foo(x,y)
     
    -

    Many languages such as Swift now also allow you to add member functions -to primitive types like int.

  • -
  • -In “Elements of Programming”, Alex often follows the pattern -of creating a function which requires strict assumptions -(such as the list being nonempty). -Then he creates a wrapper -which does additional checks or work to ensure the assumptions -are met. -This makes the algorithm more modular and faster -for those other components which can guarantee the assumptions, -without doing extra work.
  • +

    However Alex doesn’t follow this convention.

  • -Alex: I have this in code going back to 1985. +

    In “Elements of Programming”, Alex often follows this pattern. +First he writes a function requiring many strict preconditions, for example a list must be nonempty. +Then he creates a wrapper function which does additional work to guarantee the preconditions are met.

    + +

    Removing all the special cases allows the core algorithm to be expressed concisely. +And it also becomes more modular, as other components can often guarantee a subset of the preconditions without doing extra work.

  • +
  • +Alex: I have this in code going back to 1985. I wrote it then in Scheme without goto, but it had other efficiency problems. Since then I have published the code, multiple times. @@ -692,20 +585,19 @@

    Code

    (“The Ada Generic Library Linear List Processing Packages”). I got such angry letters, especially from Holland, saying, “don’t you know that goto is harmful?”. -I couldn’t find another solution.
  • -
  • +I couldn’t find another solution.
  • +
  • In my test (Intel i5, 4 core, g++ 9.3.0) I noticed about a 15-20% improvement -sorting one million integers.
  • -
  • -

    Alex: If you want to see a really bad program -see Patrick Henry Winston’s -book “LISP 1st Edition”. -Look at his sorting algorithm for list: radix_sort.lisp. -It posses many remarkable properties including using n log(n) extra storage. -Good example of a famous person at a respectable school -publishing something terrible. -Published does not mean good.

  • +sorting one million integers. +
  • +

    Alex: If you want to see a really bad program see Patrick Henry Winston’s book “LISP 1st Edition”. +Look at his sorting algorithm for lists: radix_sort.lisp. +It possesses many remarkable properties including using n log(n) extra storage. +It shouldn’t need any extra storage. +It’s also slow. +Good example of a famous person at a respectable school publishing something terrible. +Published does not mean good.

  • diff --git a/12_merge_sort.md b/12_merge_sort.md index e1ebcc8..8f85ea9 100644 --- a/12_merge_sort.md +++ b/12_merge_sort.md @@ -18,22 +18,22 @@ I wanted this job really badly. A friend of mine who worked at this Institute recommended me to this group. I knew I was going to go -there for for an interview and I'd have to say some wonderful things or they +there for an interview and I'd have to say some wonderful things or they would not give me a job. -Right before my interview, terrible thing happened. +Right before my interview, terrible thing happened. I ate some raw fish and it was very tasty but within - eight hours my temperature was 103. +eight hours my temperature was 103. I was in a bad situation. I'm in the hospital flying above my bed. That happens to you when you have high fever. -But, what I'm thinking about is this upcoming interview about parallel computers. +But, what I'm thinking about is this upcoming interview about parallel computers. I know nothing about parallel computers not now, nor ever, but I really want the job. So I need to come up with some brilliant idea. So I'm floating above the bed in space and thinking. -Then I suddenly realize how to add four numbers in parallel I said oh I could add them -in parallel if I add the first two and the second two in parallel: +Then I suddenly realize how to add four numbers in parallel I said, "oh I could add them +in parallel if I add the first two and the second two in parallel": + / \ @@ -47,7 +47,7 @@ When you're sick amazing things happen in your mind. Then I realized the second great thing, while still floating. It could be multiplication and the same thing will work. I started realizing more and more -functions will work `min`, `max`, but division will not work. +functions will work `min`, `max`, but division will not work. Then I realized it's good to be sick. It has to do something which I almost forgot called abstract algebra. How is this related? @@ -56,12 +56,12 @@ there is this thing called associativity. I realized that this idea will work as long as the operation is associative. This became my central theme. -how could I talk about associative operations? -how could I write algorithms like that? +How could I talk about associative operations? +How could I write algorithms like that? When I came to United States somewhere along the way, maybe in Austria, I realized that merge was associative It was a very big deal because I never even thought about merge in terms of `+` and `*`, but merge is associative. -Then I realized I could do merge sort with parallel reduction. +Then I realized I could do merge sort with parallel reduction. That's the summary. Alex, what have you been doing all your life? This. You might say, "that's not good enough". @@ -103,12 +103,12 @@ things like that. So it's the combination of two things: 1. Doing programming in mathematical theories -2. not losing efficiency +2. Not losing efficiency We are slowly hinting more at concepts. How is a concept different from a class? How is a concept different than an abstract class -or interface in java? +or interface in Java? One way to see this is try to think of how to specify the concept of an iterator using them. Consider that iterators have a `value_type`. @@ -127,7 +127,7 @@ The second argument stays the same. In Java you can kind of fix it by doing lots of casting. But, the other problem is more serious as you can't fix the fundamental variance on `value_type`. -People say, "but couldn't there be a correct inheritance? 
+People say, "but couldn't there be a correct inheritance?" Of course, yes. But it's not the inheritance we have in C++ or Java. @@ -141,13 +141,13 @@ There are several [iterator concepts][cpp-iterator-concepts] in STL[^sgi], all of which we have mentioned before: - [`InputIterator`][cpp-input-iterator]: can only advance forward and once advanced, iterators pointing to previous elements become invalid. -- [`ForwardIterator`][cpp-forward-iteartor]: can advance forward and have iterators pointing to previous elements. +- [`ForwardIterator`][cpp-forward-iterator]: can advance forward and have iterators pointing to previous elements. - [`BidirectionalIterator`][cpp-bi-iterator]: can move iterators forward and backward. - [`RandomAccessIterator`][cpp-random-iterator]: can advance iterators by arbitrary steps in constant time (like pointers). [cpp-iterator-concepts]: https://en.cppreference.com/w/cpp/iterator [cpp-input-iterator]: https://en.cppreference.com/w/cpp/named_req/InputIterator -[cpp-forward-iteartor]: https://en.cppreference.com/w/cpp/named_req/ForwardIterator +[cpp-forward-iterator]: https://en.cppreference.com/w/cpp/named_req/ForwardIterator [cpp-bi-iterator]: https://en.cppreference.com/w/cpp/named_req/BidirectionalIterator [cpp-random-iterator]: https://en.cppreference.com/w/cpp/named_req/RandomAccessIterator @@ -155,7 +155,8 @@ all of which we have mentioned before: can be a better resource. [sgi-iterator-concepts]: https://www.boost.org/sgi/stl/Iterators.html -### Linked iterator + +## Linked iterator There are some people who say, "concepts aren't important because Alex already exhausted them. there are just four kinds of iterators and that's all." @@ -172,7 +173,7 @@ the normal operations: - `==`: equality - `*`: dereference -It has one additional operation which allows +it has one additional operation which allows you to set the successor. If you have an iterator and you have another one, you can just make the second iterator the successor @@ -183,14 +184,13 @@ Of course, the standard model is a linked list. To make our `list_pool` iterator a `LinkedIterator` we simply add the following function: - void set_successor(iterator x, iterator y) { x.pool->next(x.node) = y.node; } We want it to be a function not a member because primitive/built-in -types don't have member functions. +types don't have member functions[^member-functions-on-primitives]. Furthermore, in the case of an iterator, it might be a pointer, not a class. In general, I don't like member functions. @@ -203,36 +203,32 @@ Or foo(x) There is at least one provable advantage. -It has one fewer character[^character]. +It has one fewer character[^member-functions-more-typing]. But, since we think generically, a better question is whether the function could operate on a built-in type instead of a class. -[^andrei]: I suspect he is referring to [Andrei Alexandrescu](https://en.wikipedia.org/wiki/Andrei_Alexandrescu) who gave a talk +[^andrei]: Alex is likely referring to [Andrei Alexandrescu](https://en.wikipedia.org/wiki/Andrei_Alexandrescu) who gave a talk ["Iterators must Go"][iterators-must-go]. -[^character]: - This argument doesn't hold up if there are arguments: - - x.foo(y) +[^member-functions-more-typing]: This argument doesn't hold if the function requires more than one argument. 
+ These two forms both require 8 characters to type: - Versus: + x.foo(y) foo(x, y) - foo(x, y) - - I suppose you could drop the space: + Perhaps the space could be dropped: foo(x,y) - Many languages such as Swift now also allow you to add member functions - to primitive types like `int`. + However Alex doesn't follow this convention. +[^member-functions-on-primitives]: Other programming languages, such as Swift, allow you to extend primitive types like `int` with additional member functions. [iterators-must-go]: https://accu.org/conf-docs/PDFs_2009/AndreiAlexandrescu_iterators-must-go.pdf ### Linked iterator is "unsafe" -As long as you don't do it, the topology remains. +As long as you don't do `set_successor`, the topology remains. But, if you do, things change. This is a very unsafe operation. You can even make circular lists. @@ -248,8 +244,8 @@ Of course, if you try to get to the end, you will not be successful. But, it is a legitimate data structure. -A long time ago when we programming in Lisp, -there were people saying, "never use `rplcd`" (set successor). +A long time ago when we were programming in Lisp, +there were people saying, "never use set successor (`rplcd`)". They were wrong. Use whatever is given to you. But, be wise. @@ -261,9 +257,9 @@ That doesn't terminate. Well maybe just `while (true)` is bad. a = 1; - while (a > 0) { do stuff } + while (a > 0) { do stuff } -There many ways of writing bad code +There are many ways of writing bad code and syntax can't help you. No smart pointer or syntactic constraint will make a bad programmer into a good programmer. @@ -271,12 +267,12 @@ It will just replace one sort of bugs with another. Bad programmers are very creative, they find amazing things. -## Reverse linked ranges +### Reverse linked ranges To understand how this concept works we will look at a basic algorithm. If I hadn't already shown you `set_successor` -we can learn it's use from this algorithm. +we can learn its use from this algorithm. It takes two lists, reverses the first, and attaches it to the second. It's a very important list algorithm. @@ -332,7 +328,7 @@ Also note that algorithmically there is no good reason why we evaluate On every single iteration, because only the one iterator we moved in the loop could become empty. -**Exercise:** Write a theoretically more efficient merge which does not do this extra comparison (solved just below). +**Exercise:** Write an alternative merge which does not do this extra comparison every loop (solved just below). ### Merge with fewer comparisons @@ -341,8 +337,8 @@ I tell you not to use inheritance, smart pointers, or read certain books. Now I'm going to teach you to use `goto`. The greatest authority in computer science wrote a famous letter to -communications of ACM called, ["Go To Statement Considered Harmful"][harmful]. -There is is nothing harmful in the computer. +communications of ACM called, ["Go To Statement Considered Harmful"][goto-harmful]. +There is nothing harmful in the computer. The computer is good. It has an instruction called [JMP][jump] (or branch). If it's there, why not use it? @@ -423,23 +419,23 @@ and attach it, I need to be able to modify the successor of the final node. But, if we return the end, it's probably `nil` so I don't have it. -Just return you all the information. +You just return all the information. The caller can do whatever they please. If they want to ignore it, ignore it. 
Note that it assumes the lists are nonempty, which is perfectly fine for our sort, -which is not going to merge empty lists[^non-empty]. +which is not going to merge empty lists[^non-empty-pattern]. Can you write it without `goto`? Not as efficiently[^without-goto]. [jump]: https://en.wikipedia.org/wiki/JMP_(x86_instruction) -[harmful]: papers/goto-harmful.pdf +[goto-harmful]: papers/goto-harmful.pdf -[^without-goto]: Alex: I have this in code going back to 1985. +[^without-goto]: Alex: I have this in code going back to 1985. I wrote it then in Scheme without `goto`, but it had other efficiency problems. Since then I have published the code, multiple times. @@ -449,23 +445,20 @@ Not as efficiently[^without-goto]. saying, "don't you know that `goto` is harmful?". I couldn't find another solution. -[^non-empty]: In "Elements of Programming", Alex often follows the pattern - of creating a function which requires strict assumptions - (such as the list being nonempty). - Then he creates a wrapper - which does additional checks or work to ensure the assumptions - are met. - This makes the algorithm more modular and faster - for those other components which can guarantee the assumptions, - without doing extra work. - +[^non-empty-pattern]: In "Elements of Programming", Alex often follows this pattern. + First he writes a function requiring many strict preconditions, for example a list must be nonempty. + Then he creates a wrapper function which does additional work to guarantee the preconditions are met. + + Removing all the special cases allows the core algorithm to be expressed concisely. + And it also becomes more modular, as other components can often guarantee a subset of the preconditions without doing extra work. + [ada]: https://en.wikipedia.org/wiki/Ada_(programming_language) ### Is it worth it? We now have two programs. One of them is oh so very simple. -The other one is elegant, but long and does a minimal number of operations. +The other one is elegant, but long and does a minimal number of operations. Which one should we use in practice? We need to do a lot of experiments to establish certainty of what we are doing[^test-result]. @@ -479,7 +472,7 @@ Modern processors try to predict which way a condition is going to go. If you look at our branches, the probability of going one way or the other is 50 percent. -That's literally the worst thing which which could happen. +That's literally the worst thing which could happen. I'm entering the territory where I haven't done work yet. I haven't tried see whether some kind of predication avoidance of `goto` could be done. @@ -502,7 +495,7 @@ There is no code, it's just a way of invoking our merge function. It has 5 arguments, and we need a binary function. template - // I is Linked Iteratork + // I is Linked Iterator struct mergesort_linked_operation { typedef I argument_type; @@ -534,15 +527,14 @@ Observe it is the same machine as we had before. It is very efficient, and doesn't do any kind of list splitting[^bad-code]. If you don't find it's beautiful I have nothing to teach you. We aren't on the same wavelength. -In the modern world, you meet somebody -somebody and +In the modern world, you meet somebody and say, "[Bach][bach] is a great composer". He says, "nah, [Lady Gaga][gaga] is much more gifted". It's a free world. People are allowed to say whatever they like. 
-Some people will say, "oh this is not object-oriented", or "this is not functional", +Some people will say, "oh this is not object-oriented", or "this is not functional", which it isn't and I'm very proud of it. -But, in some sense this is literally the essence of my life work, this piece of +But, in some sense this is literally the essence of my life's work, this piece of code. That's where it started. That's where it ends. @@ -550,14 +542,20 @@ The majority of people do not get it. The majority of computer scientists do not get it. There is absolutely no indication that getting it will make you rich. +**Exercise:** Implement a visualizer (such as console output) which shows + the contents of the counter at each step of the merge algorithm. + +**Exercise:** Implement merge sort with `std::accumulate` (left fold) instead of binary counter. + This is a very inefficient merge sort, but can be helpful to understand how the binary + counter works. + [^bad-code]: - Alex: If you want to see a really bad program - see [Patrick Henry Winston][winston]'s - book ["LISP 1st Edition"][lisp-book]. - Look at his sorting algorithm for list: [radix_sort.lisp](code/other/radix_sort.lisp). - It posses many remarkable properties including using `n log(n)` extra storage. - Good example of a famous person at a respectable school - publishing something terrible. + Alex: If you want to see a really bad program see [Patrick Henry Winston][winston]'s book ["LISP 1st Edition"][lisp-book]. + Look at his sorting algorithm for lists: [radix_sort.lisp](code/other/radix_sort.lisp). + It possesses many remarkable properties including using `n log(n)` extra storage. + It shouldn't need any extra storage. + It's also slow. + Good example of a famous person at a respectable school publishing something terrible. Published does not mean good. [lisp-book]: http://people.csail.mit.edu/phw/Books/LISPBACK.HTML diff --git a/13_searching.html b/13_searching.html index ee022fb..8270457 100644 --- a/13_searching.html +++ b/13_searching.html @@ -2,116 +2,9 @@ + 13. Searching - + @@ -139,10 +32,8 @@

    History of binary search

    He was the guy who invented first general purpose computer, but we don’t remember people like that. In 1946 he gave a brilliant series of -lectures at the Moore School at Pennsylvania University -on programming. -For the first time, -he described things like merge, merge sort, and binary search, +lectures at the Moore School at Pennsylvania University on programming. +For the first time, he described things like merge, merge sort, and binary search. This is not a bad thing to be the first person to describe. He designed ENIAC which should make him very famous. Indeed, he did some very fundamental work.

    @@ -150,67 +41,52 @@

    History of binary search

Then comes this interesting fact (from “The Art of Computer Programming”) It takes about 15 years for people to come up with binary search which sort of works for all possible inputs.
-Apparently people didn’t have trouble coding binary search when the length is
-of the form 2^(n-1).
-Because it’s easy, you take the middle element and
-then both sides will be of the same form
-and you can keep dividing.
+Apparently people didn’t have trouble coding binary search when the length is of the form 2^n - 1.
+Because it’s easy, you take the middle element and then both sides will be of the same form and you can keep dividing.
Apparently people couldn’t do it.
-Knuth claims that the first correct implementation was done by
-D.H. Lehmer.
-He is someone you should know about
-as a very great computer scientist.
+Knuth claims that the first correct implementation was done by D.H. Lehmer.
+He is someone you should know about as a very great computer scientist.
He did amazing amount of work on computational number theory,
-things like sieves for discovering large primes and many other
-important things.
-Among other things, he published a binary search which at least
-always terminated.
+like sieves for discovering large primes and many other important things.
+Among other things, he published a binary search which at least always terminated.

    +like sieves for discovering large primes and many other important things. +Among other things, he published a binary search which at least always terminated.

    I actually disagree with Knuth slightly and claim that the first correct binary search was published roughly at the same time, -but a couple of years after, -by a German computer science. +but a couple of years after, by a German computer scientist. Once again, he is unjustly forgotten. He does not appear on Wikipedia1. -His name is Herman Bottenbruch
    -His claim to fame is he was one of their people who invented Algol 58 the predecessor -of Algol 60. +His name is Herman Bottenbruch. +His claim to fame is he was one of the people who invented Algol 58, the predecessor of Algol 60. He is one of the people who tried unsuccessfully to convince American delegates to Algol 58 committee that they should introduce block structures. He was one of the inventors of blocks. American representatives which included such brilliant people as John Backus -and Alan Perlis actually reject it as too hard to implement. +and Alan Perlis actually rejected it as too hard to implement. They didn’t know how to do stacks. -But sadly enough he doesn’t get much credit, -especially credit for correct binary search. +But sadly enough he doesn’t get much credit, especially credit for correct binary search. We will be actually studying his version.

    bsearch is wrong

    If we think about merging two sequences of roughly the same length, -or rather exactly the same length n, the expected number of comparison -is going to be a to 2n - 1. -From which follows a conjecture. -If we have sequences -of size n and size m the number of comparisons should be n + m - 1. -Not every conjecture is true however, this one is definitely false. -Here is a simple counter example. -Take a sequence of length 1000 -and a sequence of length 1. -We only need log(1000) because -we can binary search for it’s index.

    - -

    So there is a fundamental possibility +or rather exactly the same length n, the expected number of comparisons is going to be 2n - 1. +From which follows a conjecture. +If we have sequences of size n and size m the number of comparisons should be n + m - 1. +Not every conjecture is true however. +This one is definitely false. +Here is a simple counterexample. +Take a sequence of length 1000 and a sequence of length 1. +We only need log(1000) because we can binary search for its index.

    + +

    So there is a fundamental possibility for using binary search for merging, dramatically reducing the number of comparisons. log(n) is much smaller than n.
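As a hedged illustration of the counterexample (our own code, written directly against the standard library):

    #include <algorithm>
    #include <vector>

    // Merging one element into a sorted sequence of length n costs about
    // log2(n) comparisons: binary search finds the insertion point directly.
    std::vector<int> merge_one(std::vector<int> big, int x) {
      // upper_bound keeps the merge stable: x lands after any elements
      // already in big that are equal to it.
      big.insert(std::upper_bound(big.begin(), big.end(), x), x);
      return big;
    }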

    -

    You might think we can just use binary search from -a standard library, such as C bsearch(3). +

    You might think we can just use binary search from a standard library, such as C bsearch(3). Sounds like a plausible idea. It was written by great UNIX guys. -They know something about programming, so let us see what they provide us with, -by quoting the man page:

    +They know something about programming, so let us see what they provide us with (see man 3 bsearch):

    void* bsearch(
         const void* key,
    @@ -221,22 +97,19 @@ 

    bsearch is wrong

    );
    -

    Notice it takes all these parameters, -and it’s a little messy because it’s C. -It’s hard for them2. +

    Notice it takes all these parameters, and it’s a little messy because it’s C. +Components are hard for them2. Nevermind what it takes. What’s interesting is what it returns.

    returns a pointer to a matching member of the array, or NULL if no match is found.

    -

    So for our merge, it will most often return NULL . +

    So for our merge, it will most often return NULL. At which point, you will have to do linear search. -So observe, ancient interface, -done by brilliant people, -in the standard library and it’s utterly useless.

    +So observe, ancient interface, done by brilliant people, in the standard library, and it’s utterly useless.

    -

    Even if, we are so fortunte as to get a pointer to an element back. -Does it help with merge? Especially if we want to make it stable? +

    Even if we are so fortunate as to get a pointer to an element back, does it help with merge? +Especially if we want to make it stable? No.

    If there are multiple elements that match the key, the element returned is unspecified.
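A small demonstration of the contrast (our own example; the comparison function and data are ours):

    #include <algorithm>
    #include <cstdio>
    #include <cstdlib>

    // bsearch returns *some* match (or NULL); lower_bound returns the
    // partition point, which is what a merge needs, present key or not.
    int compare_ints(const void* a, const void* b) {
      int x = *static_cast<const int*>(a);
      int y = *static_cast<const int*>(b);
      return (x > y) - (x < y);
    }

    int main() {
      int a[] = {1, 3, 5, 5, 5, 7};
      int key = 5;
      void* p = std::bsearch(&key, a, 6, sizeof(int), compare_ints);
      int* q = std::lower_bound(a, a + 6, key);
      // p points at one of the three 5s; which one is unspecified.
      // q always points at the first 5 (index 2); if 5 were absent,
      // q would still point where it belongs.
      std::printf("%td %td\n", static_cast<int*>(p) - a, q - a);
      return 0;
    }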

    @@ -249,75 +122,66 @@

    bsearch is wrong

    What is correct code?

    -

    Here comes another philosphical point. +

    Here comes another philosophical point. What does wrong mean? What does incorrect mean? -At school they told you that the program is incorrect when it doesn’t satisfy its specifications. -Well, then bsearch is a correct program. +At school they told you that the program is incorrect when it doesn’t satisfy its specifications.

    + +

    Well then bsearch is a correct program. I looked at the source, it does do what it promises to do. It will return NULL. I wish it were not correct. -I wished it returned something useful.

    +I wish it returned something useful.

    -

    Correctness is a deeper concept than -just satisfying specification. -Well in reality, as you guys know, -it must be deeper -because you haven’t got any specifications. +

    Correctness is a deeper concept than just satisfying specification. +Well in reality, as you guys know, it must be deeper because you haven’t got any specifications. When you write code, it’s not that you are given specifications and need to encode them. I suspect that has never happened in your life, nor will it happen in any foreseeable future. -But you still have to attempt to do something which is correct. -Of course the people who advocate writing specifications will say yes, first +But you still have to attempt to do something which is correct.

    + +

    Of course the people who advocate writing specifications will say yes, first you will write specification, and then implement specification. But, it’s not going to help. Because, if you write wrong specification, you are the same guy who is going to write the implementation. Most likely it will not make it correct.

    So it’s a deeper thing. -You have to establish correctness from more fundamental principles. +You have to establish correctness from more fundamental principles. The program is correct if it returns desirable information, if it does what it’s supposed to do in some absolute sense. It’s very hard to prove it.

    -

    i think one of the lessons of this particular lecture is -how hard simple things are. -lots of very bright people cannot give it a correct interface. -same with bsearch.

    +

I think one of the lessons of this particular lecture is how hard simple things are.
+Lots of very bright people cannot give it a correct interface. Same with bsearch.

    -

    You might say, “Alex just talks about his beef -with the standard committtee.” +

    You might say, “Alex just talks about his beef with the standard committee.” No. -What I’m trying to tell you is that -when you write things like that in your code, +What I’m trying to tell you is that when you write things like that in your code, There will be some other guy using your code. Always think about that other guy. -The great flaw in most code I see is there is no consideration -for the other guy. +The great flaw in most code I see is there is no consideration for the other guy. People think, “oh it works, so it’s done.” My dream is that we all write code thinking about other people. -Then you say, “well, then i have to do more work.” +Then you say, “well, then I have to do more work.” This is the beauty of sharing. -Uou might have attended kindergarten -and had a teacher that taught you it’s good to share toys. +You might have attended kindergarten and had a teacher that taught you it’s good to share toys. She was right.

    Linear search

    -

    It would be very contrary to the way I do things to -start with binary search. +

    It would be very contrary to the way I do things to start with binary search. How could we do binary search if we cannot do linear search? -In STL it is called std::find or std::find_if3, -Let’s see how to write it. -We can assume we know how to do it, -and start from the top, -or we could assume we don’t know what are doing, -which is usually then case when starting new things. +In STL it is called std::find or std::find_if3.

    + +

Let’s see how to write it.
+We can assume we know how to do it, and start from the top,
+or we could assume we don’t know what we are doing,
+which is usually the case when starting new things.
I seldom start writing code from the signature.
+I don’t know what the signature is.
+I typically have some algorithmic idea, often an inner loop, so I start with that.
+Then write code inside out:

    +I don’t know what the signature is. +I typically have some algorithmic idea, often an inner loop, so I start with that, +Then write code inside out:

    while (first != last && ... find the element...) ++first;
     
    @@ -329,7 +193,7 @@

    Linear search

    // P is a unary predicate I find_if(I first, I last, P pred) { // [first, last) is a valid range. - while (first != last && !pred(first)) ++first; + while (first != last && !pred(*first)) ++first; return first; }
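A usage sketch, assuming the find_if above is in scope (the predicate and data are our own):

    #include <vector>

    struct is_negative {
      typedef int argument_type;
      bool operator()(int x) const { return x < 0; }
    };

    int main() {
      int data[] = {3, 1, -4, 1, 5};
      std::vector<int> v(data, data + 5);
      std::vector<int>::iterator it = find_if(v.begin(), v.end(), is_negative());
      // it points to -4; if nothing matched it would equal v.end().
      return 0;
    }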
    @@ -340,16 +204,13 @@

    Linear search

    Trimming the standard

    -

    One of the mistakes which frequently happens is people -use the principle of Occam’s Razor -and say, “we need to only have one find_if”. +

One of the mistakes which frequently happens is people use the principle of Occam’s Razor and say, “we need to only have one find_if”.
That’s what happened.
After I submitted STL it had many fine functions, but Bjarne was very afraid that STL was too large and would not be accepted, as is.
-(It wasn’t that enormous at that point)
+(It wasn’t that enormous at that point.)
He said, “why don’t I come to Palo Alto (I was at HP Labs)
-and bring along bunch of other standard committee people and we
-will trim it”.
+and bring along a bunch of other standard committee people and we will trim it”.
Trimming was a sad thing.
Imagine somebody coming with a knife and cutting pieces of your flesh.
One of the things he said was there should be only one find_if.

    @@ -414,15 +275,15 @@

    Helper functions

    }
    - -

    Bounded and counted ranges.

    + +

    Bounded and counted ranges

    Once upon a time I believed ranges come in two kinds5.

    1. Bounded ranges: Ranges bounded by a first and last iterator/pointer.
    2. Counted ranges: Ranges constructed from a first pointer/iterator and an - interger N.
    3. + integer N.
    @@ -430,7 +291,7 @@

    Bounded and counted ranges.

    Both are good, and both are different. You cannot say one is better than the other. It depends on the algorithm. -There used to be more bounded range algorithms in the STL, but they wre taken out. +There used to be more bounded range algorithms in the STL, but they were taken out. For example we have std::copy and std::copy_n both are really convenient.
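The same copy phrased both ways (our own sketch):

    #include <algorithm>
    #include <vector>

    int main() {
      std::vector<int> src(100, 7), dst(100);
      std::copy(src.begin(), src.end(), dst.begin());  // bounded: [first, last)
      std::copy_n(src.begin(), 100, dst.begin());      // counted: first and n
      return 0;
    }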

    @@ -474,8 +335,7 @@

    Bounded and counted ranges.

    Advance and distance functions

    How does std::advance work? -It was introduced by me to allow -us to do long or fast thing depending on iterator type. +It was introduced by me to allow us to do long or fast thing depending on iterator type. For a pointer it will translate to one instruction. It’s going to be fast. In the case of a linked list, it’s going to be linear time. @@ -483,13 +343,13 @@

    Advance and distance functions

    template<typename I, typename N>
     inline
    -void advance{I& first, N n) {
    +void advance(I& first, N n) {
       advance(first, n, std::iterator_traits<I>::iterator_category);
     }
     
     template<typename I, typename N>
     inline
    -void advance{I& first, N n, std::input_iterator_tag) {
    +void advance(I& first, N n, std::input_iterator_tag) {
       while (n > 0) {
         ++first;
         --n;
    @@ -498,27 +358,27 @@ 

    Advance and distance functions

    template<typename I, typename N> inline -void advance{I& first, N n, std::random_access_iterator_tag) { +void advance(I& first, N n, std::random_access_iterator_tag) { first += n; }

    The dispatch between these is clearly done at compile time. -I could have had every iterator have += and - iterator. +I could have had every iterator have += and - operators. My thinking was that people have an expectation that those symbols are fast. -Making it linear time will confuse many people.

    +Making it linear time will confuse many people.6

    Now let’s implement std::distance:

    template<typename I>
     inline
    -typename std::iterator_traits<I>::difference_type distance{I& first, I& last) {
    +typename std::iterator_traits<I>::difference_type distance(I& first, I& last) {
       return distance(first, last, std::iterator_traits<I>::iterator_category);
     }
     
     template<typename I>
     inline
    -typename std::iterator_traits<I>::difference_type distance{I& first, I& last, std::input_iterator_tag) {
    +typename std::iterator_traits<I>::difference_type distance(I& first, I& last, std::input_iterator_tag) {
  typename std::iterator_traits<I>::difference_type n = 0;
       while (first != last) {
         ++first;
    @@ -529,7 +389,7 @@ 

    Advance and distance functions

    template<typename I> inline -typename std::iterator_traits<I>::difference_type distance{I& first, I& last, std::random_access_iterator_tag) { +typename std::iterator_traits<I>::difference_type distance(I& first, I& last, std::random_access_iterator_tag) { return last - first; }
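A small demonstration of what the dispatch buys (our own example):

    #include <iterator>
    #include <list>
    #include <vector>

    int main() {
      std::vector<int> v(1000);
      std::list<int> l(1000);
      std::vector<int>::iterator vi = v.begin();
      std::list<int>::iterator li = l.begin();
      std::advance(vi, 500);  // random access tag: a single +=
      std::advance(li, 500);  // input tag: loops 500 times
      return 0;
    }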
    @@ -546,23 +406,29 @@

    Code


    1. -He does now!
    2. +He has a Wikipedia page now!
    3. -Specifically the usage of function pointers and void* instead of generics.
4. 
+C does not have a type safe form of generics like template.
+This makes it difficult to write reusable components in the way Alex teaches.
+The workaround used by bsearch and qsort is to pass and return void *, a pointer to any type.
    5. -Alex: “The name is stolen from Common Lisp find. +Alex: The name is stolen from Common Lisp find. Always try to borrow from some place. Originality is frowned upon, especially for naming. -Everyone loves to make non-standard names.”
    6. +Everyone loves to make non-standard names.
    7. -This is an amusing comment because -Alex just got done talking about why find_if_not -was such a helpful contribution -and how we need should consider the needs of the user -and give them various convenience interfaces for algorithms.
8. 
+I find this comment amusing because
+Alex just got done talking about why find_if_not was such a helpful contribution.
+It’s not clear how to reconcile his advice to carefully consider the convenience of the user
+with this comment about not wanting to provide convenient interfaces for algorithms.
    9. -These two kinds of ranges are discussed in depth in “Elements of Programming”.
    10. +These two kinds of ranges are discussed in depth in “Elements of Programming” chapter 6. +
    11. +Alex: I am not saying I was right, because when we were writing +“Elements of Programming”, Paul and I, we decided to abandon advance and distance, +and just say that depending on iterator category, +the complexity of these operators change.
    diff --git a/13_searching.md b/13_searching.md index 36559a8..47ec19c 100644 --- a/13_searching.md +++ b/13_searching.md @@ -9,10 +9,8 @@ Of course, you have never heard of him. He was the guy who invented first general purpose computer, but we don't remember people like that. In 1946 he gave a brilliant series of -lectures at the [Moore School][moore-school] at [Pennsylvania University][penn] -on programming. -For the first time, -he described things like merge, merge sort, and binary search, +lectures at the [Moore School][moore-school] at [Pennsylvania University][penn] on programming. +For the first time, he described things like merge, merge sort, and binary search. This is not a bad thing to be the first person to describe. He designed [ENIAC][eniac] which should make him very famous. Indeed, he did some very fundamental work. @@ -20,42 +18,32 @@ Indeed, he did some very fundamental work. Then comes this interesting fact (from "The Art of Computer Programming") It takes about 15 years for people to come up with binary search which sort of works for all possible inputs. -Apparently people didn't have trouble coding binary search when the length is -of the form `2^(n-1)`. -Because it's easy, you take the middle element and -then both sides will be of the same form -and you can keep dividing. +Apparently people didn't have trouble coding binary search when the length is of the form `2^(n-1)`. +Because it's easy, you take the middle element and then both sides will be of the same form and you can keep dividing. Apparently people couldn't do it. -Knuth claims that the first correct implementation was done by -[D.H. Lehmer][lehmer]. -He is someone you should know about -as a very great computer scientist. +Knuth claims that the first correct implementation was done by [D.H. Lehmer][lehmer]. +He is someone you should know about as a very great computer scientist. He did amazing amount of work on computational number theory, -things like sieves for discovering large primes and many other -important things. -Among other things, he published a binary search which at least -always terminated. +like sieves for discovering large primes and many other important things. +Among other things, he published a binary search which at least always terminated. I actually disagree with Knuth slightly and claim that the first correct binary search was published roughly at the same time, -but a couple of years after, -by a German computer science. +but a couple of years after, by a German computer scientist. Once again, he is unjustly forgotten. -He does not appear on Wikipedia[^wiki]. -His name is [Herman Bottenbruch][bottenbruch] -His claim to fame is he was one of their people who invented [Algol 58][algol-58] the predecessor -of Algol 60. +He does not appear on Wikipedia[^bottenbruch-not-on-wiki]. +His name is [Herman Bottenbruch][bottenbruch]. +His claim to fame is he was one of the people who invented [Algol 58][algol-58], the predecessor of Algol 60. He is one of the people who tried unsuccessfully to convince American delegates to Algol 58 committee that they should introduce block structures. He was one of the inventors of blocks. American representatives which included such brilliant people as [John Backus][backus] -and [Alan Perlis][perlis] actually reject it as too hard to implement. +and [Alan Perlis][perlis] actually rejected it as too hard to implement. They didn't know how to do stacks. -But sadly enough he doesn't get much credit, -especially credit for correct binary search. 
+But sadly enough he doesn't get much credit, especially credit for correct binary search. We will be actually studying his version. -[^wiki]: He does now! +[^bottenbruch-not-on-wiki]: He has a Wikipedia page [now][bottenbruch]! [mauchly]: https://en.wikipedia.org/wiki/John_Mauchly [moore-school]: https://en.wikipedia.org/wiki/Moore_School_of_Electrical_Engineering @@ -70,29 +58,23 @@ We will be actually studying his version. ## bsearch is wrong If we think about merging two sequences of roughly the same length, -or rather exactly the same length `n`, the expected number of comparison -is going to be a to `2n - 1`. -From which follows a conjecture. -If we have sequences -of size `n` and size `m` the number of comparisons should be `n + m - 1`. -Not every conjecture is true however, this one is definitely false. -Here is a simple counter example. -Take a sequence of length 1000 -and a sequence of length 1. -We only need `log(1000)` because -we can binary search for it's index. - -So there is a fundamental possibility +or rather exactly the same length `n`, the expected number of comparisons is going to be `2n - 1`. +From which follows a conjecture. +If we have sequences of size `n` and size `m` the number of comparisons should be `n + m - 1`. +Not every conjecture is true however. +This one is definitely false. +Here is a simple counterexample. +Take a sequence of length 1000 and a sequence of length 1. +We only need `log(1000)` because we can binary search for its index. + +So there is a fundamental possibility for using binary search for merging, dramatically reducing the number of comparisons. `log(n)` is much smaller than `n`. - -You might think we can just use binary search from -a standard library, such as C [`bsearch(3)`][bsearch]. +You might think we can just use binary search from a standard library, such as C [`bsearch(3)`][bsearch]. Sounds like a plausible idea. It was written by great UNIX guys. -They know something about programming, so let us see what they provide us with, -by quoting the man page: +They know something about programming, so let us see what they provide us with (see `man 3 bsearch`): void* bsearch( const void* key, @@ -102,22 +84,19 @@ by quoting the man page: int (*compare)(const void*, const void*) ); -Notice it takes all these parameters, -and it's a little messy because it's C. -It's hard for them[^bsearch-hard]. +Notice it takes all these parameters, and it's a little messy because it's C. +Components are hard for them[^bsearch-generics-hard]. Nevermind what it takes. What's interesting is what it returns. > returns a pointer to a matching member of the array, or NULL if no match is found. -So for our merge, it will most often return `NULL` . +So for our merge, it will most often return `NULL`. At which point, you will have to do linear search. -So observe, ancient interface, -done by brilliant people, -in the standard library and it's utterly useless. +So observe, ancient interface, done by brilliant people, in the standard library, and it's utterly useless. -Even if, we are so fortunte as to get a pointer to an element back. -Does it help with merge? Especially if we want to make it stable? +Even if we are so fortunate as to get a pointer to an element back, does it help with merge? +Especially if we want to make it stable? No. > If there are multiple elements that match the key, the element returned is unspecified. @@ -127,28 +106,29 @@ It's a typical story for binary search. Even when the book is written by famous people. I'll show you how to write it. 
-[^bsearch-hard]: Specifically the usage of function pointers and `void*` instead of generics. +[^bsearch-generics-hard]: C does not have a type safe form of generics like `template`. + This makes it difficult to write reusable components in the way Alex teaches. + The workaround used by `bsearch` and `qsort` is to return `void *` which is a pointer to any type. ### What is correct code? -Here comes another philosphical point. +Here comes another philosophical point. *What does wrong mean?* *What does incorrect mean?* At school they told you that the program is incorrect when *it doesn't satisfy its specifications*. -Well, then `bsearch` is a correct program. + +Well then `bsearch` is a correct program. I looked at the source, it does do what it promises to do. It will return `NULL`. I wish it were not correct. -I wished it returned something useful. +I wish it returned something useful. -Correctness is a deeper concept than -just satisfying specification. -Well in reality, as you guys know, -it must be deeper -because you haven't got any specifications. +Correctness is a deeper concept than just satisfying specification. +Well in reality, as you guys know, it must be deeper because you haven't got any specifications. When you write code, it's not that you are given specifications and need to encode them. I suspect that has never happened in your life, nor will it happen in any foreseeable future. But you still have to attempt to do something which is correct. + Of course the people who advocate writing specifications will say yes, first you will write specification, and then implement specification. But, it's not going to help. @@ -156,51 +136,43 @@ Because, if you write wrong specification, you are the same guy who is going to Most likely it will not make it correct. So it's a deeper thing. -You have to establish correctness from more fundamental principles. +You have to establish correctness from more fundamental principles. *The program is correct if it returns desirable information*, if it does what it's supposed to do in some absolute sense. It's very hard to prove it. -i think one of the lessons of this particular lecture is -how hard simple things are. -lots of very bright people cannot give it a correct interface. -same with `bsearch`. +I think one of the lessons of this particular lecture is how hard simple things are. +lots of very bright people cannot give it a correct interface. Same with `bsearch`. -You might say, "Alex just talks about his beef -with the standard committtee." +You might say, "Alex just talks about his beef with the standard committee." No. -What I'm trying to tell you is that -when you write things like that in your code, +What I'm trying to tell you is that when you write things like that in your code, There will be some other guy using your code. Always think about that other guy. -The great flaw in most code I see is there is no consideration -for the other guy. +The great flaw in most code I see is there is no consideration for the other guy. People think, "oh it works, so it's done." My dream is that we all write code thinking about other people. -Then you say, "well, then i have to do more work." +Then you say, "well, then I have to do more work." This is the beauty of sharing. -Uou might have attended kindergarten -and had a teacher that taught you it's good to share toys. +You might have attended kindergarten and had a teacher that taught you it's good to share toys. She was right. 
[bsearch]: https://man7.org/linux/man-pages/man3/bsearch.3.html ## Linear search -It would be very contrary to the way I do things to -start with binary search. +It would be very contrary to the way I do things to start with binary search. How could we do binary search if we cannot do linear search? -In STL it is called [`std::find`][cpp-find] or `std::find_if`[^name], +In STL it is called [`std::find`][cpp-find] or `std::find_if`[^name-of-find-function]. + Let's see how to write it. -We can assume we know how to do it, -and start from the top, -or we could assume we don't know what are doing, -which is usually then case when starting new things. +We can assume we know how to do it, and start from the top, +or we could assume we don't know what we are doing, +which is usually the case when starting new things. I seldom start writing code from the signature. -I typically have some algorithmic idea, -so I start with that, -often an inner loop. -Then write code inside out. +I don't know what the signature is. +I typically have some algorithmic idea, often an inner loop, so I start with that, +Then write code inside out: while (first != last && ... find the element...) ++first; @@ -211,7 +183,7 @@ Now write the signature: // P is a unary predicate I find_if(I first, I last, P pred) { // [first, last) is a valid range. - while (first != last && !pred(first)) ++first; + while (first != last && !pred(*first)) ++first; return first; } @@ -220,16 +192,13 @@ because it is single pass. ### Trimming the standard -One of the mistakes which frequently happens is people -use the principle of [Occam's Razor][razor] -and say, "we need to only have one `find_if`". +One of the mistakes which frequently happens is people use the principle of [Occam's Razor][razor] and say, "we need to only have one `find_if`". That's what happened. After I submitted STL it had many fine functions, but Bjarne was very afraid that STL was too large and would not be accepted, as is. -(It wasn't that enormous at that point) +(It wasn't that enormous at that point.) He said, "why don't I come to Palo Alto (I was at HP Labs) -and bring along bunch of other standard committee people and we -will trim it". +and bring along bunch of other standard committee people and we will trim it". Trimming was a sad thing. Imagine somebody coming with a knife and cutting pieces of your flesh. One of the things he said was there should be only one `find_if`. @@ -296,19 +265,19 @@ They are just wrappers[^wrapper]. [cpp-not1]: https://en.cppreference.com/w/cpp/utility/functional/not1 [cpp-not2]: https://en.cppreference.com/w/cpp/utility/functional/not2 -## Bounded and counted ranges. +## Bounded and counted ranges Once upon a time I believed ranges come in two kinds[^eop-range-kinds]. 1. **Bounded ranges**: Ranges bounded by a first and last iterator/pointer. 2. **Counted ranges**: Ranges constructed from a first pointer/iterator and an - interger `N`. + integer `N`. Which one is better? Both are good, and both are different. You cannot say one is better than the other. It depends on the algorithm. -There used to be more bounded range algorithms in the STL, but they wre taken out. +There used to be more bounded range algorithms in the STL, but they were taken out. For example we have [`std::copy`][cpp-copy] and [`std::copy_n`][cpp-copy-n] both are really convenient. @@ -346,19 +315,18 @@ So let's try writing `find_if_n` return std::make_pair(first, n); } -[^name]: Alex: "The name is stolen from Common Lisp [`find`][clhs-find]. 
+[^name-of-find-function]: Alex: The name is stolen from Common Lisp [`find`][clhs-find]. Always try to borrow from some place. Originality is frowned upon, especially for naming. - Everyone loves to make non-standard names." + Everyone loves to make non-standard names. -[^wrapper]: This is an amusing comment because - Alex just got done talking about why `find_if_not` - was such a helpful contribution - and how we need should consider the needs of the user - and give them various convenience interfaces for algorithms. +[^wrapper]: I find this comment amusing because + Alex just got done talking about why `find_if_not` was such a helpful contribution + It's not clear how to reconcile his advice to carefully considering convenience of user, + with this comment about not wanting to provide convenient interfaces for algorithms. -[^eop-range-kinds]: These two kinds of ranges are discussed in depth in "Elements of Programming". +[^eop-range-kinds]: These two kinds of ranges are discussed in depth in "Elements of Programming" chapter 6. [cpp-find]: https://en.cppreference.com/w/cpp/algorithm/find [clhs-find]: http://clhs.lisp.se/Body/f_find_.htm @@ -369,8 +337,7 @@ So let's try writing `find_if_n` ## Advance and distance functions How does [`std::advance`][cpp-advance] work? -It was introduced by me to allow -us to do long or fast thing depending on iterator type. +It was introduced by me to allow us to do long or fast thing depending on iterator type. For a pointer it will translate to one instruction. It's going to be fast. In the case of a linked list, it's going to be linear time. @@ -378,13 +345,13 @@ Here is how it works: template inline - void advance{I& first, N n) { + void advance(I& first, N n) { advance(first, n, std::iterator_traits::iterator_category); } template inline - void advance{I& first, N n, std::input_iterator_tag) { + void advance(I& first, N n, std::input_iterator_tag) { while (n > 0) { ++first; --n; @@ -393,26 +360,26 @@ Here is how it works: template inline - void advance{I& first, N n, std::random_access_iterator_tag) { + void advance(I& first, N n, std::random_access_iterator_tag) { first += n; } The dispatch between these is clearly done at compile time. -I could have had every iterator have `+=` and `-` iterator. +I could have had every iterator have `+=` and `-` operators. My thinking was that people have an expectation that those symbols are fast. -Making it linear time will confuse many people. 
+Making it linear time will confuse many people.[^eop-advance-distance] Now let's implement [`std::distance`][cpp-distance]: template inline - typename std::iterator_traits::difference_type distance{I& first, I& last) { + typename std::iterator_traits::difference_type distance(I& first, I& last) { return distance(first, last, std::iterator_traits::iterator_category); } template inline - typename std::iterator_traits::difference_type distance{I& first, I& last, std::input_iterator_tag) { + typename std::iterator_traits::difference_type distance(I& first, I& last, std::input_iterator_tag) { typename std::iterator_traits::difference_type n; while (first != last) { ++first; @@ -423,16 +390,18 @@ Now let's implement [`std::distance`][cpp-distance]: template inline - typename std::iterator_traits::difference_type distance{I& first, I& last, std::random_access_iterator_tag) { + typename std::iterator_traits::difference_type distance(I& first, I& last, std::random_access_iterator_tag) { return last - first; } +[^eop-advance-distance]: Alex: I am not saying I was right, because when we were writing + "Elements of Programming", Paul and I, we decided to abandon advance and distance, + and just say that depending on iterator category, + the complexity of these operators change. [cpp-advance]: https://en.cppreference.com/w/cpp/iterator/advance [cpp-distance]: https://en.cppreference.com/w/cpp/iterator/distance - - ## Code - [algorithm.h](code/algorithm.h) diff --git a/14_binary_search.html b/14_binary_search.html index c93b651..82f436b 100644 --- a/14_binary_search.html +++ b/14_binary_search.html @@ -2,116 +2,9 @@ + 14. Binary Search - + @@ -135,16 +28,14 @@

    Bisection in math

    Bisection is an idea which is pretty ancient. It was first used in a mathematical setting by Lagrange who applied it for -finding roots of polynomials that was around 1796. -Two people independently discovered +finding roots of polynomials, that was around 1796. +Two people independently discovered it in late 1810s. The first is Bernard Bolzano the second is Augustin-Louis Cauchy. They both invented the very famous theorem which is called bisection. Where does it appear in mathematics? -The intermediate value theorem. -If you have a function f which goes from negative to positive, -it has to cross zero.

    +The intermediate value theorem, if you have a continuous function f which goes from negative to positive, it has to cross zero.

                    ____f(b) > 0
                    /
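A minimal sketch of the bisection method itself (ours; it assumes f(a) < 0 < f(b)):

    // Halve the interval until it is smaller than a tolerance eps.
    double bisect(double (*f)(double), double a, double b, double eps) {
      while (b - a > eps) {
        double m = a + (b - a) / 2;
        if (f(m) < 0) a = m;  // root is in the right half
        else          b = m;  // root is in the left half
      }
      return a;
    }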
    @@ -170,10 +61,9 @@ 

    Partitions

    We are still struggling on our path to binary search1. It deals with two things. -First -it deals with a monotonically non-decreasing sequence. +First, it deals with a monotonically non-decreasing sequence. Second of all it has some -binary predicate which establishes StrictWeakOrdering +binary predicate which establishes StrictWeakOrdering on the sequence and which allows you to compare elements. This is too much, even for people who design APIs. In order to write it correctly we need to reduce it to a simpler problem.

    @@ -181,9 +71,7 @@

    Partitions

    Even simpler than the notion of a sorted sequence is a partitioned sequence. A sequence is partitioned based on a predicate. -A sequence is partitioned if the predicate -is true for some sub range of items, -and then false for the rest[^order].

    +A sequence is partitioned if the predicate is true for some sub range of items, and then false for the rest[^order].

    Is partitioned

    @@ -210,12 +98,12 @@

    Is partitioned

    inline bool is_partitioned(I first, I last, P pred) { first = find_if_not(first, last, pred); - find_if(first, last, pred); + first = find_if(first, last, pred); return first == last; }
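A quick usage check (our own example; it assumes the is_partitioned above and the earlier find_if functions are in scope):

    struct is_even {
      typedef int argument_type;
      bool operator()(int x) const { return x % 2 == 0; }
    };

    int main() {
      int a[] = {2, 4, 6, 1, 3};  // true values first, then false
      int b[] = {2, 1, 4};        // a true value after a false one
      bool p1 = is_partitioned(a, a + 5, is_even());  // true
      bool p2 = is_partitioned(b, b + 3, is_even());  // false
      return p1 && !p2 ? 0 : 1;
    }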
    -

    Now for bounded ranges:

    +

    Now for counted ranges:

    template<typename I, typename N, typename P>
     // I is InputIterator
    @@ -233,18 +121,15 @@ 

    Is partitioned

    Partition point

    -

    When we partition we will have true guys -followed by false:

    +

    When we partition we will have true guys followed by false:

    T T T F F F
           ^
     
    -

    There is only one special thing, -the partition point. -If we understand the partition -point everything else will be very simple and there is no ambiguity. -find_if actually finds the partitioned point. +

    There is only one special thing, the partition point. +If we understand the partition point everything else will be very simple and there is no ambiguity. +find_if actually finds the partition point. But, it does too many predicate applications. We could do better if our range is at least forward iterable.

    @@ -253,64 +138,60 @@

    Partition point

    while iteration is relatively fast3, you still could reduce the number of tests to log(n). As we shall see we have a very good bound on the number of traversal -operations which is literally n, not order of n, +operations which is literally n, not order of n. So we can get it so it works for everything. -Then it works on arrays much much faster -than linked lists.

    +Then it works on arrays much much faster than linked lists.

    -

    A distinguished computer scientist recently -asked me, “what if we just make it linear. will it really affect practical algorithms”. -The answer is yes, -very badly.

    +

A distinguished computer scientist recently asked me, “What if we just make it linear? Will it really affect practical algorithms?”
+The answer is yes, very badly.

    The algorithm to find it faster is to test the middle. How do we go to the middle? Divide by 2. -Dividing numbers is easier -so we will start with a counted range, instead of bounded.

    +Dividing numbers is easier so we will start with a counted range, instead of bounded.

    template<typename I, typename N, typename P>
    -// I is InputIterator
    -// P is unary predicate 
    +// I is ForwardIterator
    +// P is UnaryPredicate 
     // N is integral type
     // value_type of I == argument_type of P
     inline
    -I partitioned_point_n(I first, N n, P pred) {
    +I partition_point_n(I first, N n, P pred) {
       while (n) {
         N half = n >> 1;
         I middle = first;
         std::advance(middle, half);
         if (pred(*middle)) {
    -      n = half;
    -    } else {
           ++middle;
           first = middle;
           n -= (half + 1);
    +    } else {
    +      n = half;
         }
       }
       return first;
     }
     
    -

    Why a shift (n >> 1)? We know it’s non-negative. +

    Why did I use a shift (n >> 1)? We know it’s non-negative. I’m a penny pincher. -Maybe the compiler will automatically do it for n / 2 -maybe it will not. +Maybe the compiler will automatically do it for n / 2 maybe it will not. Now it will.

    -

    How many ++ operations do we do? -n /2 + n/4 + ... + = n. -We are traversing more than linear -search on the average case. +

    How many ++ operations do we do?

    + +
    n/2 + n/4 + ... = n
    +
    + +

    We are traversing more than linear search on the average case. We are also not trying to be lucky and find equal.

template<typename I, typename P>
    -// I is InputIterator
    -// P is unary predicate 
    -// N is integral type
    +// I is ForwardIterator
    +// P is UnaryPredicate 
     // value_type of I == argument_type of P
     inline
    -I partitioned_point_n(I first, I last, P pred) {
    +I partition_point(I first, I last, P pred) {
       return partition_point_n(first, std::distance(first, last), pred);
     }
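A usage sketch, assuming the partition_point above is in scope (the predicate and data are ours):

    struct less_than_5 {
      typedef int argument_type;
      bool operator()(int x) const { return x < 5; }
    };

    int main() {
      int a[] = {1, 2, 3, 7, 9};  // partitioned: x < 5 holds, then fails
      int* p = partition_point(a, a + 5, less_than_5());
      // p points to 7, found with 3 predicate applications
      // (on 3, 9, then 7) rather than 4 by linear search.
      return 0;
    }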
     
    @@ -329,7 +210,7 @@

    Upper and lower bound

    Increasing is too much. If I want to sort my coins and there are two pennies however much I want to sort, -I’m no going to have an ascending sequence. +I’m not going to have an ascending sequence. What we need to guarantee is that for no pair of elements x_i, x_j where j > i that x_i > x_j.

    @@ -391,7 +272,7 @@

    Is sorted

    and there is a profound relation between is_sorted and adjacent_find but you’re going to discover it yourself. Write a program that uses std::adjacent_find and - think about this relationship.

    + try to figure out this relationship.

    Binary search with partition points

    @@ -423,24 +304,21 @@

    Binary search with partition points

    because they only look for 5.

    As far as I could ascertain, these names are invented by me but I think they’re good names, -upper_bound and lower_bound. there are two. +upper_bound and lower_bound, there are two. So what property does the upper bound have? It is the first element which is greater. -Both lower bound and upper -bound split our range into two ranges. +Both lower bound and upper bound split our range into two ranges. So in some sense we actually have 3 ranges:

    [first [lower)   [upper)  last)
     
      1. [first, lower)
      2. [lower, upper)
    - 3. [uppper, last)
    + 3. [upper, last)
     
    -

    You can actually find them both together -a little faster, than separately. -There is a function std::equal_range -which does that.

    +

You can actually find them both together a little faster than you can separately5.
+There is a function std::equal_range which does that.
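The three ranges, concretely (our own example):

    #include <algorithm>
    #include <utility>

    int main() {
      int a[] = {1, 3, 5, 5, 5, 7};
      std::pair<int*, int*> r = std::equal_range(a, a + 6, 5);
      // r.first is the lower bound (index 2), r.second the upper bound
      // (index 5); [r.first, r.second) holds the elements equal to 5.
      return 0;
    }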

    First let’s implement a function object which is our predicate for partitioning. @@ -462,7 +340,7 @@

    Binary search with partition points

    Now we can write lower_bound, which is the version -of “binary search” that we want5:

    +of “binary search” that we want6:

    template <typename I, typename N, typename R>
     // I is ForwardIterator
    @@ -509,12 +387,12 @@ 

    Project: Partitioning

    I want you to think about this partitioning, especially in terms of our wonderful binary_counter device. However, we also want the partition to be stable. -You want to move all the bad guys up front and good guys the tail end. +You want to move all the bad guys up front and good guys to the tail end. You will return an iterator pointing to the partition point separating good from bad. This partition is stable if the relative order of good guys and bad guys is maintained, -meaning if I have a alphabetically sorted list of employees and I +meaning if I have an alphabetically sorted list of employees and I want to divide them by say gender, if I partition them, they will still be alphabetically sorted.

    @@ -540,8 +418,8 @@

    Code


    1. -Alex: Paul McJones has a good friend Butler Lampson who is a Turing award -winner. We went to lunch and he told us binary search is the +Alex: Paul McJones has a good friend Butler Lampson who is a Turing award winner. +We went to lunch and he told us binary search is the only algorithm a programmer needs to know. I think sort should be there too, but we’ll take his opinion.
    2. @@ -549,19 +427,18 @@

      Code

      because they just don’t listen to arguments. Whatever arguments you give them, they just say, “It’s the standard”.
    3. -In writing my masters thesis -I actually came across a comparison operator which is very expensive to evaluate. -It is the Dehornoy ordering on braid groups. -This is a total ordering which I used for sorting, removing duplicates, -and other algorithms in a very similar style to STL. -Optimizing the number of comparisons made a large difference.
4. 
+In writing my Master’s thesis
+I actually came across a comparison operator which is very expensive to evaluate,
+called the Dehornoy ordering for braid groups.
+This provides a total ordering which I used for sorting, removing duplicates, and other algorithms in STL style.
+In this case, having algorithms that carefully optimized the number of comparisons made a significant difference in performance.
    5. -

      Centering binary search around predicates solves an important problem. -One thing you might want to do is binary search an array of records by -a particular field. +

      Framing binary search as finding the partition point solves an important problem. +Suppose you want to search an array of records by a particular field. You can of course make a record having arbitrary values for all the other fields besides the one you care about, -but it would be nice to just provide the one you want. +but it would be nice to provide just the key you want. For example:

      binary_search(
      @@ -572,18 +449,22 @@ 

      Code

      );
      -

      This does not fit with the traditional theory of StrictWeakOrdering as -Name and Person are not elements of the same domain. -The comparison function is no longer an ordering -at all, or even an operation.

      +

      But what is the theoretical basis for a function comparing a key to a person record? +It’s not a StrictWeakOrdering as Name and Person are not elements of the same domain. +In this case the comparison function is no longer an ordering at all, or even an operation. +If we condense the key and comparison into a predicate, and find the partition point, then this problem goes away.
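A sketch of the predicate version (our own types and names):

    #include <cstring>

    struct Person {
      const char* name;
    };

    // Bundle the key into a predicate and ask for the partition point;
    // no heterogeneous "ordering" is needed anywhere.
    struct name_less_than {
      const char* key;
      explicit name_less_than(const char* k) : key(k) {}
      typedef Person argument_type;
      bool operator()(const Person& p) const {
        return std::strcmp(p.name, key) < 0;
      }
    };

    // partition_point(first, last, name_less_than("Smith")) yields the
    // first Person whose name is not less than "Smith": a lower bound,
    // obtained without ever comparing a Name to a Person.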

      -

      By defining binary search as a partition point then this problem goes away. -Apparently this information was lost, -and you can see some of their discussions of this -topic here and here.

    6. +

      It appears the C++ standards committee was confused about this for some time. +See “Binary Search with Heterogeneous Comparison” +and “Binary search requirements overly strict” for further discussion.

    7. +

      Here is one improvement. +First find the lower bound in the range [first, last). +Then find the upper bound in the range [lower, last). +Further optimization is possible.
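Sketched in code (ours, written against the standard library):

    #include <algorithm>
    #include <utility>

    template <typename I, typename T>
    // I is ForwardIterator
    std::pair<I, I> equal_range_faster(I first, I last, const T& value) {
      I lower = std::lower_bound(first, last, value);
      I upper = std::upper_bound(lower, last, value);  // search a shorter range
      return std::make_pair(lower, upper);
    }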

    8. +
    9. Alex: If you remember there was this -grand review one day where the committee through out a bunch of useful functions. +grand review one day where the committee threw out a bunch of useful functions. Well they inserted some too. One is called std::binary_search. A friend asked me, “where is binary search?” @@ -592,7 +473,7 @@

      Code

      and he said, “but, where is binary search?”. So, I did it for him. Who can argue with a best friend. -Will I ever use it? No.
    10. +Will I ever use it? No.
    diff --git a/14_binary_search.md b/14_binary_search.md index ae48ab9..b6143f4 100644 --- a/14_binary_search.md +++ b/14_binary_search.md @@ -5,16 +5,14 @@ Bisection is an idea which is pretty ancient. It was first used in a mathematical setting by [Lagrange][lagrange] who applied it for -finding roots of polynomials that was around 1796. -Two people independently discovered +finding roots of polynomials, that was around 1796. +Two people independently discovered it in late 1810s. The first is [Bernard Bolzano][bolzano] the second is [Augustin-Louis Cauchy][cauchy]. They both invented the very famous theorem which is called [bisection][bisection]. Where does it appear in mathematics? -The [intermediate value theorem][ivt]. -If you have a function `f` which goes from negative to positive, -it has to cross zero. +The [intermediate value theorem][ivt], if you have a [continuous function][continuous-function] `f` which goes from negative to positive, it has to cross zero. ____f(b) > 0 / @@ -41,16 +39,15 @@ It is a great idea. [bisection]: https://en.wikipedia.org/wiki/Bisection_method [bolzano-theorem]: https://en.wikipedia.org/wiki/Bolzano%E2%80%93Weierstrass_theorem [ivt]: https://en.wikipedia.org/wiki/Intermediate_value_theorem - +[continuous-function]: https://en.wikipedia.org/wiki/Continuous_function ## Partitions -We are still struggling on our path to binary search[^paul]. +We are still struggling on our path to binary search[^pauls-friend]. It deals with two things. -First -it deals with a monotonically non-decreasing sequence. +First, it deals with a monotonically non-decreasing sequence. Second of all it has some -binary predicate which establishes `StrictWeakOrdering` +binary predicate which establishes `StrictWeakOrdering` on the sequence and which allows you to compare elements. This is too much, even for people who design APIs. In order to write it correctly we need to reduce it to a simpler problem. @@ -58,9 +55,7 @@ In order to write it correctly we need to reduce it to a simpler problem. Even simpler than the notion of a sorted sequence is a **partitioned sequence**. A sequence is partitioned based on a predicate. -A sequence is partitioned if the predicate -is true for some sub range of items, -and then false for the rest[^order]. +A sequence is partitioned if the predicate is true for some sub range of items, and then false for the rest[^order]. ### Is partitioned @@ -76,7 +71,7 @@ Don't you want satisfying things? But it's wrong. We want partition sequence to be sorted on the boolean values and since STL assumes that ascending order is natural, -The right thing to do is make partition consistent[^changing-standard], false values go first. +The right thing to do is make partition consistent[^changing-standard-difficult], false values go first. But, for our course we will follow the standard. template @@ -86,11 +81,11 @@ But, for our course we will follow the standard. inline bool is_partitioned(I first, I last, P pred) { first = find_if_not(first, last, pred); - find_if(first, last, pred); + first = find_if(first, last, pred); return first == last; } -Now for bounded ranges: +Now for counted ranges: template // I is InputIterator @@ -105,23 +100,20 @@ Now for bounded ranges: } -[^changing-standard]: Alex: Changing something in this standard is impossible, +[^changing-standard-difficult]: Alex: Changing something in this standard is impossible, because they just don't listen to arguments. Whatever arguments you give them, they just say, "It's the standard". 
### Partition point -When we partition we will have true guys -followed by false: +When we partition we will have true guys followed by false: T T T F F F ^ -There is only one special thing, -the partition point. -If we understand the partition -point everything else will be very simple and there is no ambiguity. -`find_if` actually finds the partitioned point. +There is only one special thing, the partition point. +If we understand the partition point everything else will be very simple and there is no ambiguity. +`find_if` actually finds the partition point. But, it does too many predicate applications. We could do better if our range is at least forward iterable. @@ -130,85 +122,79 @@ if your predicate is very expensive, while iteration is relatively fast[^slow-comparison-example], you still could reduce the number of tests to `log(n)`. As we shall see we have a very good bound on the number of traversal -operations which is literally `n`, not order of `n`, +operations which is literally `n`, not order of `n`. So we can get it so it works for everything. -Then it works on arrays much much faster -than linked lists. +Then it works on arrays much much faster than linked lists. -A distinguished computer scientist recently -asked me, "what if we just make it linear. will it really affect practical algorithms". -The answer is yes, -very badly. +A distinguished computer scientist recently asked me, "what if we just make it linear. will it really affect practical algorithms". +The answer is yes, very badly. The algorithm to find it faster is to test the middle. How do we go to the middle? Divide by 2. -Dividing numbers is easier -so we will start with a counted range, instead of bounded. +Dividing numbers is easier so we will start with a counted range, instead of bounded. template - // I is InputIterator - // P is unary predicate + // I is ForwardIterator + // P is UnaryPredicate // N is integral type // value_type of I == argument_type of P inline - I partitioned_point_n(I first, N n, P pred) { + I partition_point_n(I first, N n, P pred) { while (n) { N half = n >> 1; I middle = first; std::advance(middle, half); if (pred(*middle)) { - n = half; - } else { ++middle; first = middle; n -= (half + 1); + } else { + n = half; } } return first; } -Why a shift (`n >> 1`)? We know it's non-negative. +Why did I use a shift (`n >> 1`)? We know it's non-negative. I'm a penny pincher. -Maybe the compiler will automatically do it for `n / 2` -maybe it will not. +Maybe the compiler will automatically do it for `n / 2` maybe it will not. Now it will. How many `++` operations do we do? -`n /2 + n/4 + ... + = n`. -We are traversing more than linear -search on the average case. + + n/2 + n/4 + ... + = n. + +We are traversing more than linear search on the average case. We are also not trying to be lucky and find equal. template - // I is InputIterator - // P is unary predicate - // N is integral type + // I is ForwardIterator + // P is UnaryPredicate // value_type of I == argument_type of P inline - I partitioned_point_n(I first, I last, P pred) { + I partition_point(I first, I last, P pred) { return partition_point_n(first, std::distance(first, last), pred); } -[^paul]: Alex: Paul McJones has a good friend [Butler Lampson][lampson] who is a Turing award - winner. We went to lunch and he told us binary search is the +[^pauls-friend]: Alex: Paul McJones has a good friend [Butler Lampson][lampson] who is a Turing award winner. 
+ We went to lunch and he told us binary search is the only algorithm a programmer needs to know. I think sort should be there too, but we'll take his opinion. -[^slow-comparison-example]: In writing my masters thesis - I actually came across a comparison operator which is very expensive to evaluate. - It is the Dehornoy ordering on [braid groups][braid-research]. - This is a total ordering which I used for sorting, removing duplicates, - and other algorithms in a very similar style to STL. - Optimizing the number of comparisons made a large difference. +[^slow-comparison-example]: In writing my Master's thesis + I actually came across a comparison operator which is very expensive to evaluate, + called the Dehornoy ordering for [braid groups][braid-research]. + This provides a total ordering which I used for sorting, removing duplicates, and other algorithms in STL style. + In this case having + algorithms that carefully optimized the number of comparisons made a significant different in performance. [lampson]: https://en.wikipedia.org/wiki/Butler_Lampson [braid-research]: https://github.com/justinmeiners/braid-rank-thesis ## Upper and lower bound - We need to talk a little bit about sorted ranges. A precondition to binary search is not that the range is partitioned, but it is sorted. @@ -220,7 +206,7 @@ then you would be wrong. Increasing is too much. If I want to sort my coins and there are two pennies however much I want to sort, -I'm no going to have an ascending sequence. +I'm not going to have an ascending sequence. What we need to guarantee is that for no pair of elements `x_i`, `x_j` where `j > i` that `x_i > x_j`. @@ -280,7 +266,7 @@ I made a choice that `<` is the primitive one. and there is a profound relation between `is_sorted` and `adjacent_find` but you're going to discover it yourself. Write a program that uses `std::adjacent_find` and - think about this relationship. + try to figure out this relationship. [cpp-adjacent-find]: https://en.cppreference.com/w/cpp/algorithm/adjacent_find @@ -312,30 +298,32 @@ to all these bad binary search that return `-1`, because they only look for `5`. As far as I could ascertain, these names are invented by me but I think they're good names, -`upper_bound` and `lower_bound`. there are two. +`upper_bound` and `lower_bound`, there are two. So what property does the upper bound have? It is the first element which is greater. -Both lower bound and upper -bound split our range into two ranges. +Both lower bound and upper bound split our range into two ranges. So in some sense we actually have 3 ranges: [first [lower) [upper) last) 1. [first, lower) 2. [lower, upper) - 3. [uppper, last) + 3. [upper, last) -You can actually find them both together -a little faster, than separately. -There is a function [`std::equal_range`][cpp-equal-range] -which does that. +You can actually find them both together a little faster, than you can separately[^clarify-equal-range]. +There is a function [`std::equal_range`][cpp-equal-range] which does that. -[^partition-point]: Centering binary search around predicates solves an important problem. - One thing you might want to do is binary search an array of records by - a particular field. +[^clarify-equal-range]: + Here is one improvement. + First find the lower bound in the range `[first, last)`. + Then find the upper bound in the range `[lower, last)`. + Further optimization is possible. + +[^partition-point]: Framing binary search as finding the partition point solves an important problem. 
+ Suppose you want to search an array of records by a particular field. You can of course make a record having arbitrary values for all the other fields besides the one you care about, - but it would be nice to just provide the one you want. + but it would be nice to provide just the key you want. For example: binary_search( @@ -345,22 +333,20 @@ which does that. [](Person a, const char* name){ a.name < name } ); - This does not fit with the traditional theory of `StrictWeakOrdering` as - `Name` and `Person` are not elements of the same domain. - The comparison function is no longer an ordering - at all, or even an operation. + But what is the theoretical basis for a function comparing a key to a person record? + It's not a `StrictWeakOrdering` as `Name` and `Person` are not elements of the same domain. + In this case the comparison function is no longer an ordering at all, or even an operation. + If we condense the key and comparison into a predicate, and find the partition point, then this problem goes away. - By defining binary search as a partition point then this problem goes away. - Apparently this information was lost, - and you can see some of their discussions of this - topic [here][standards-1] and [here][standards-2]. + It appears the C++ standards committee was confused about this for some time. + See ["Binary Search with Heterogeneous Comparison"][binary-search-standards-1] + and ["Binary search requirements overly strict"][binary-search-standards-2] for further discussion. -[standards-1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2001/n1313.html -[standards-2]: https://cplusplus.github.io/LWG/issue270 +[binary-search-standards-1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2001/n1313.html +[binary-search-standards-2]: https://cplusplus.github.io/LWG/issue270 [cpp-equal-range]: https://en.cppreference.com/w/cpp/algorithm/equal_range [cpp-binary-search]: https://en.cppreference.com/w/cpp/algorithm/binary_search - First let's implement a function object which is our predicate for partitioning. It is defined by `P(x) = a < x` @@ -409,7 +395,7 @@ found in the final code. [^binary-search]: Alex: If you remember there was this - grand review one day where the committee through out a bunch of useful functions. + grand review one day where the committee threw out a bunch of useful functions. Well they inserted some too. One is called [`std::binary_search`][cpp-binary-search]. A friend asked me, "where is binary search?" @@ -438,12 +424,12 @@ of expensive `n log(n)` algorithms. I want you to think about this partitioning, especially in terms of our wonderful `binary_counter` device. However, we also want the partition to be **stable**. -You want to move all the bad guys up front and good guys the tail end. +You want to move all the bad guys up front and good guys to the tail end. You will return an iterator pointing to the partition point separating good from bad. This partition is *stable if the relative order of good guys and bad guys is maintained*, -meaning if I have a alphabetically sorted list of employees and I +meaning if I have an alphabetically sorted list of employees and I want to divide them by say gender, if I partition them, they will still be alphabetically sorted. diff --git a/15_merge_inplace.html b/15_merge_inplace.html index e368063..166428c 100644 --- a/15_merge_inplace.html +++ b/15_merge_inplace.html @@ -2,116 +2,9 @@ + 15. Merge inplace - + @@ -140,58 +33,44 @@

    Reinventing things

    The first solution was done by a Russian computer scientist Alexander Kronrod, but it wasn’t stable. -There was another solution -by a wonderful Argentinian computer scientist -Luis Pardo (Knuth’s student).

    - -

    So, was thinking and thinking and thinking and once I was -waking up in the phase between Twilight, when you wake up but still see -dreams but you’re not quite asleep. -I saw the algorithm on the board, it does happen. +There was another solution by a wonderful Argentinian computer scientist Luis Pardo (Knuth’s student).

    + +

    So, I was thinking and thinking and thinking and once I was waking up in the phase between Twilight, +when you wake up but still see dreams but you’re not quite asleep. +I saw the algorithm on the board, it does happen. I was ecstatic. I think it was 1984. What do you do if you find a really beautiful algorithm? If you’re me, you call your friends. -So I called Dave Musser -and said, “this is absolutely gorgeous” -and he agreed. +So I called Dave Musser and said, “this is absolutely gorgeous” and he agreed. I started implementing and doing measurements. Meanwhile, he starts telling faculty around him.

    -

    Here comes the bad news. One of his colleagues, -Erich Kaltofen who is a distinguished -specialist in computer algebra, -came to Dave and said “Yeah it’s nice, -but there are these two Polish guys -Dudzinski and Dydek -who published it in 1981 (two years before).” +

    Here comes the bad news. +One of his colleagues, Erich Kaltofen who is a distinguished specialist in computer algebra, came to Dave and said +“Yeah it’s nice, but there are these two Polish guys Dudzinski and Dydek who published it in 19811 (two years before).” It was very sad. -But, we often reinvent things.

    +But, we often reinvent things. +Meanwhile Knuth heard about it from his friend Vaughan Pratt who told him about it. +So he gives the attribution to his friend, and not the two polish guys2.

    -

    Meanwhile Knuth heard about it from his friend -Ian Pratt who told him about it. -So he gives the attribution to his friend, -and not the two polish guys. -As an algorithm itself, it’s utterly useless. -Sometimes algorithms published by theoreticians -can be used by us after appropriate modification.

    +

    As an algorithm itself, it’s utterly useless. +But sometimes algorithms published by theoreticians can be used by us, after appropriate modification.

    Merging adjacent lists

    -

    We got into all of this because we were -thinking about merge. -We already learned to merge linked list. +

    We got into all of this because we were thinking about merge. +We already learned to merge linked list (chapter 12). I introduced this thing called goto. Today I’m going to teach you some other bad programming practices because it’s always pleasant to introduce techniques which are known to be bad in general when they’re good in particular.

    -

    We’re going to to go about things in a funny way. +

    We’re going to go about things in a funny way. Normally when I teach merge, -we then realize it can be used for sorting, -we write merge sort, -and then we realize that merge sort needs extra memory, +we then realize it can be used for sorting. +We write merge sort and then we realize that merge sort needs extra memory, and say “oh it’s very unfortunate. Couldn’t we find merge sort that doesn’t require extra memory?” This time around, I decided to follow a different path. @@ -201,7 +80,7 @@

    Merging adjacent lists

    into fast code. Especially if it allows you to create something architecturally nice and see connections. -So we are going to look at the much harder problem of in-place merge +So we are going to look at the much harder problem of in-place merge. The problem is actually hard if you think about it.

    @@ -225,7 +104,6 @@

    Interface

    But our algorithm will greatly benefit from counted ranges. When do we need counted ranges? When we do bisection, or something like binary search. -I like to use counted ranges. We will use two of them:

    [first1, count1)
    @@ -243,10 +121,8 @@ 

    Algorithm

    advice is to look for divide and conquer. The following is a graph of the two sorted ranges adjacent to each other. The graph for each moves up and to the right to illustrate they are ascending. -When we have lots of variables, -naming doesn’t work. -We have to use one letter names with indices, -like math.

    +When we have lots of variables naming doesn’t work. +We have to use one letter names with indices, like math.

        n0     n1
        /      /
    @@ -256,18 +132,15 @@ 

    Algorithm

    f0 f1
    -

    We will first bisect one of the ranges and pick a guy -from the middle. -Then we ask, “where would it fit in the other sequence?”. +

    We will first bisect one of the ranges and pick a guy from the middle. +Then we ask, “where would it fit in the other sequence?” Do we have a function for that? We do. It’s called lower_bound.

    Assume we bisect the left. -So then we let f0_0 = f0, -and f0_1 be the bisection -of the first interval. +Then let f0_0 = f0 and f0_1 be the bisection of the first interval. Then f1_1 is found from the right using lower_bound. -(that should take O(log(n)) comparisons).

    +(that should take O(log(n)) comparisons).

        n0     n1
        /      /
    @@ -278,11 +151,11 @@ 

    Algorithm

    f0_0 f1
    -

    Now we are going to rotate (C++11 std::rotate1). +

    Now we are going to rotate. Rotating swaps elements in the range [f0_1, f1_1) in such a way that f1 becomes the first. -[f0_0, f0_1) and [f1_1, n1) remain fixed.

    +[f0_0, f0_1) and [f1_1, n1) remain fixed (see std::rotate).

                    n0     n1
                     /      /
    @@ -294,12 +167,10 @@ 

    Algorithm

    f0_0 f1
    -

    Now we know f0_1 is in his rightful place, +

    Now we know f0_1 is in his rightful place (all lower are to the left and all greater are to the right), so we will let x = f0_1, -as he won’t be moved again (all lower are to the left) -(all greater are to the right). -So, we shrink the range -by one step by assigning f1_0 = f0_1 + 1, +as he won’t be moved again +So, we shrink the range by one step by assigning f1_0 = f0_1 + 1, and then just figure out the lengths of the remaining intervals:

    @@ -366,6 +237,9 @@

    Implementation

    }
    +

    Note that this std::rotate is the C++11 version which returns an iterator +rather than void3.
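To see a whole step at once, here is a small worked example of ours (not from the lecture). Merge the adjacent sorted ranges `1 3 5` and `2 4 6`:

    1 3 5 2 4 6    bisect the left range: f0_1 points at 3
    1 3 5 2 4 6    lower_bound for 3 in the right range: f1_1 points at 4
    1 2 3 5 4 6    rotate [f0_1, f1_1): 3 lands in its final position

Now `3` never moves again, and we recurse on the two smaller problems `1 | 2` and `5 | 4 6`.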

    +

    Now implement the right sub-problem, it is basically the same idea. The fundamental difference is upper bound, @@ -451,7 +325,7 @@

    Naming things

    I’m just emphasizing a very important naming principle which is to think about indexing. -think about indexing things you you know like +think about indexing things you know like find_if, not if_find. Suffixes should be sorted in the order of importance.

    @@ -464,7 +338,7 @@

    Sort from merge

    The second is that a one element list is sorted. Now, we just call it recursively and ask it to sort itself. -Nobody does any work except merge_inplace_n2.

    +Nobody does any work except merge_inplace_n4.

    template <typename I, typename N, typename R>
     // I is ForwardIterator
    @@ -484,7 +358,7 @@ 

    Sort from merge

    It’s a good algorithm. It’s stable. It uses no extra storage. -Whether, that’s really needed or not, we don’t know. +Whether that’s really needed or not, we don’t know. It has log(n) levels. At every level we have a merge which is O(n log(n)). so the overall complexity is O(n log^2(n)). @@ -518,12 +392,16 @@

    Code


    1. -When I first put std::rotate in STL -it returned void. +“On a Stable Minimum Storage Merging Algorithm”. Information Processing Letters. 1981.
    2. +
3. +See exercise 5.5.3 (Pg. 390) in Volume 3 of “The Art of Computer Programming”. +The algorithm itself and the attribution to Vaughan Pratt are in the solution (Pg. 701).
    4. +
    5. +Alex: When I first put std::rotate in STL it returned void. In 1995 I discovered what it should return and how to do it efficiently. When you rotate you return what the new middle is. -It took literally 20 years.
    6. -
    7. +It took literally 20 years.
    8. +
    9. If you are not familiar with recursion or mathematical induction this kind of code can look tricky. The key property is that every iteration the problem space is made smaller, @@ -531,7 +409,7 @@

      Code

      In this case, we assume we can sort both halves. That wouldn’t solve the problem normally, except that the merge algorithm gives us a way to combine them -to get a solution for the full input space.

    10. +to get a solution for the full input space.

    diff --git a/15_merge_inplace.md b/15_merge_inplace.md index a3dbe83..a637044 100644 --- a/15_merge_inplace.md +++ b/15_merge_inplace.md @@ -10,63 +10,54 @@ Many people worked on it. The first solution was done by a Russian computer scientist [Alexander Kronrod][kronrod], but it wasn't stable. -There was another solution -by a wonderful Argentinian computer scientist -[Luis Pardo][pardo] (Knuth's student). - -So, was thinking and thinking and thinking and once I was -waking up in the phase between Twilight, when you wake up but still see -dreams but you're not quite asleep. -I saw the algorithm on the board, it does happen. +There was another solution by a wonderful Argentinian computer scientist [Luis Pardo][pardo] (Knuth's student). + +So, I was thinking and thinking and thinking and once I was waking up in the phase between Twilight, +when you wake up but still see dreams but you're not quite asleep. +I saw the algorithm on the board, it does happen. I was ecstatic. I think it was 1984. What do you do if you find a really beautiful algorithm? If you're me, you call your friends. -So I called [Dave Musser][musser] -and said, "this is absolutely gorgeous" -and he agreed. +So I called [Dave Musser][musser] and said, "this is absolutely gorgeous" and he agreed. I started implementing and doing measurements. Meanwhile, he starts telling faculty around him. -Here comes the bad news. One of his colleagues, -[Erich Kaltofen][kaltofen] who is a distinguished -specialist in computer algebra, -came to Dave and said "Yeah it's nice, -but there are these two Polish guys -Dudzinski and Dydek -who [published it](papers/on-stable-merge.pdf) in 1981 (two years before)." +Here comes the bad news. +One of his colleagues, [Erich Kaltofen][kaltofen] who is a distinguished specialist in computer algebra, came to Dave and said +"Yeah it's nice, but there are these two Polish guys Dudzinski and Dydek who published it in 1981[^on-a-stable-minimum-storage-merge] (two years before)." It was very sad. But, we often reinvent things. +Meanwhile Knuth heard about it from his friend [Vaughan Pratt][pratt] who told him about it. +So he gives the attribution to his friend, and not the two polish guys[^merge-sort-exercise]. -Meanwhile Knuth heard about it from his friend -[Ian Pratt][pratt] who told him about it. -So he gives the attribution to his friend, -and not the two polish guys. As an algorithm itself, it's utterly useless. -Sometimes algorithms published by theoreticians -can be used by us after appropriate modification. +But sometimes algorithms published by theoreticians can be used by us, after appropriate modification. + +[^merge-sort-exercise]: See exercise 5.5.3 (Pg. 390) in Volume 3 of "The Art of Computer Programming". + The algorithm itself and the attribution to Vaughan Pratt is in the solution (Pg. 701). + +[^on-a-stable-minimum-storage-merge]: ["On a Stable Minimum Storage Merging Algorithm"](papers/on-stable-merge.pdf). Information Processing Letters. 1981. [kronrod]: https://en.wikipedia.org/wiki/Alexander_Kronrod [pardo]: https://www.genealogy.math.ndsu.nodak.edu/id.php?id=47194 [musser]: https://en.wikipedia.org/wiki/David_Musser [kaltofen]: https://kaltofen.math.ncsu.edu/ -[pratt]: https://en.wikipedia.org/wiki/Ian_Pratt_(computer_scientist) +[pratt]: https://en.wikipedia.org/wiki/Vaughan_Pratt ## Merging adjacent lists -We got into all of this because we were -thinking about merge. -We already learned to merge linked list. +We got into all of this because we were thinking about merge. 
+We already learned to merge linked list (chapter 12). I introduced this thing called `goto`. Today I'm going to teach you some other bad programming practices because it's always pleasant to introduce techniques which are known to be bad in general when they're good in particular. -We're going to to go about things in a funny way. +We're going to go about things in a funny way. Normally when I teach merge, -we then realize it can be used for sorting, -we write merge sort, -and then we realize that merge sort needs extra memory, +we then realize it can be used for sorting. +We write merge sort and then we realize that merge sort needs extra memory, and say "oh it's very unfortunate. Couldn't we find merge sort that doesn't require extra memory?" This time around, I decided to follow a different path. @@ -76,7 +67,7 @@ Sometimes it's actually good to start with slow code and refine it into fast code. Especially if it allows you to create something architecturally nice and see connections. -So we are going to look at the much harder problem of in-place merge +So we are going to look at the much harder problem of in-place merge. The problem is actually hard if you think about it. ### Interface @@ -98,7 +89,6 @@ For bounded (not counted) it will take three iterators: But our algorithm will greatly benefit from counted ranges. When do we need counted ranges? When we do bisection, or something like binary search. -I like to use counted ranges. We will use two of them: [first1, count1) @@ -114,11 +104,8 @@ If you don't know how to do something, the old advice is to look for divide and conquer. The following is a graph of the two sorted ranges adjacent to each other. The graph for each moves up and to the right to illustrate they are ascending. -When we have lots of variables, -naming doesn't work. -We have to use one letter names with indices, -like math. - +When we have lots of variables naming doesn't work. +We have to use one letter names with indices, like math. n0 n1 / / @@ -127,18 +114,15 @@ like math. / / f0 f1 -We will first bisect one of the ranges and pick a guy -from the middle. -Then we ask, "where would it fit in the other sequence?". +We will first bisect one of the ranges and pick a guy from the middle. +Then we ask, "where would it fit in the other sequence?" Do we have a function for that? We do. It's called `lower_bound`. Assume we bisect the left. -So then we let `f0_0 = f0`, -and `f0_1` be the bisection -of the first interval. +Then let `f0_0 = f0` and `f0_1` be the bisection of the first interval. Then `f1_1` is found from the right using `lower_bound`. -(that should take `O(log(n))` comparisons). +(that should take `O(log(n))` comparisons). n0 n1 / / @@ -148,11 +132,11 @@ Then `f1_1` is found from the right using `lower_bound`. / / f0_0 f1 -Now we are going to rotate (C++11 [`std::rotate`][cpp-rotate][^rotate]). +Now we are going to rotate. Rotating swaps elements in the range `[f0_1, f1_1)` in such a way that `f1` becomes the first. -`[f0_0, f0_1)` and `[f1_1, n1)` remain fixed. +`[f0_0, f0_1)` and `[f1_1, n1)` remain fixed (see [`std::rotate`][cpp-rotate]). n0 n1 / / @@ -163,12 +147,10 @@ in such a way that `f1` becomes the first. / / f0_0 f1 -Now we know `f0_1` is in his rightful place, +Now we know `f0_1` is in his rightful place (all lower are to the left and all greater are to the right), so we will let `x = f0_1`, -as he won't be moved again (all lower are to the left) -(all greater are to the right). 
-So, we shrink the range -by one step by assigning `f1_0 = f0_1 + 1`, +as he won't be moved again +So, we shrink the range by one step by assigning `f1_0 = f0_1 + 1`, and then just figure out the lengths of the remaining intervals: @@ -181,7 +163,6 @@ intervals: / / f0_0 f0_1 - Now notice that we have 4 adjacent sequences, each of which is individually sorted, so we can apply the algorithm recursively. @@ -196,12 +177,6 @@ Why? To preserve stability we need to make sure equal guys don't jump over each other. -[^rotate]: When I first put `std::rotate` in STL - it returned `void`. - In 1995 I discovered what it should return and how to do it efficiently. - When you rotate you return what the new middle is. - It took literally 20 years. - [cpp-rotate]: https://en.cppreference.com/w/cpp/algorithm/rotate @@ -243,6 +218,9 @@ and return 4 of them. n1_1 = n1 - n0_1; } +Note that this `std::rotate` is the C++11 version which returns an iterator +rather than `void`[^rotate-return]. + Now implement the right sub-problem, it is basically the same idea. The fundamental difference is upper bound, @@ -308,6 +286,11 @@ Now we combine them in a function that does no work. merge_inplace_n(f1_0, n1_0, f1_1, n1_1, r); } +[^rotate-return]: Alex: When I first put `std::rotate` in STL it returned `void`. + In 1995 I discovered what it should return and how to do it efficiently. + When you rotate you return what the new middle is. + It took literally 20 years. + ### Naming things There is a function in STL incorrectly called [`std::inplace_merge`][cpp-inplace] @@ -325,7 +308,7 @@ I guess, I did not see it. I'm just emphasizing a very important naming principle which is to think about indexing. -think about indexing things you you know like +think about indexing things you know like `find_if`, not `if_find`. Suffixes should be sorted in the order of importance. @@ -358,7 +341,7 @@ Nobody does any work except `merge_inplace_n`[^inductive-algorithm]. It's a good algorithm. It's stable. It uses no extra storage. -Whether, that's really needed or not, we don't know. +Whether that's really needed or not, we don't know. It has `log(n)` levels. At every level we have a merge which is `O(n log(n))`. so the overall complexity is `O(n log^2(n))`. diff --git a/16_optimizing_stable_sort.html b/16_optimizing_stable_sort.html index 37b6064..c33a8cc 100644 --- a/16_optimizing_stable_sort.html +++ b/16_optimizing_stable_sort.html @@ -2,116 +2,9 @@ + 16. Optimizing stable sort - + @@ -207,12 +100,12 @@

    How good is our stable sort?

    The numbers are nano-seconds per element. Our merge sort is called inplace. We start off ok, and then gradually get slower and slower in comparison. -Obviously, as the size increase you have to do more work. +Obviously, as the size increases you have to do more work. It’s not linear. But, ours is clearly 3-4x slower than that.

    - -

    A plan for improvement.

    + +

    A plan for improvement

    We are going to write a faster one. I am going to work with the class but I will not always tell you @@ -251,7 +144,7 @@

    Requirements

    Within 5-10% of the std::stable_sort. What would be great? Observe STL also has std::sort because it’s faster. - If it’s as fast as sort2.

+ If it’s as fast as sort, that would be great2.

    @@ -294,8 +187,8 @@

    Ideas to explore

    O(n log(n)). What is the specific coefficient for insertion sort? The sorted portion is on average half the length of the original input. -In addition, on average when we an insert an element we -only have to go half the length before finding it’s location. +In addition, on average when we insert an element we +only have to go half the length before finding its location. Therefore the complexity is roughly: n^2/4.
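Spelling the arithmetic out (our addition, not Alex's): inserting the k-th element scans a sorted prefix of length k and stops after about k/2 comparisons on average, so the total is roughly

    1/2 + 2/2 + ... + n/2 = n(n + 1)/4 ≈ n^2/4

comparisons.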

    @@ -336,9 +229,9 @@

    Ideas to explore

    The term is adaptive.

    You might think everybody can write a good merge. -If you Google “std merge` it shows you this: +If you Google “std merge” it shows you this: Since you’re a normal programmer -you migth say, "oh it’s on the web, therefore I can copy and paste and use +you might say, “oh it’s on the Web, therefore I can copy and paste and use it in my code.”

    template <class InputIterator1, class InputIterator2, class OutputIterator>
    @@ -362,45 +255,41 @@ 

    Ideas to explore

    What is the complexity of std::rotate? It’s tricky, because it depends on the kind of iterators which you have. With RandomAccessIterator the best theoretical algorithm -does n + gcd(n1, n2) assignments, not swaps. +does n + gcd(n_1, n_2) assignments, not swaps. On average GCD is small, but larger than one. -So we can get almost to n assignments, which is a lot better -than n swaps. -For ForwardIterators it happens to be n - gcd(n1, n2) swaps. -It is roughly n for bidirectional iterators.

    +So we can get almost to n assignments, which is a lot better than n swaps. +For ForwardIterator it happens to be n - gcd(n_1, n_2) swaps. +It is roughly n swaps for BidirectionalIterator (3n assignments).

    -

    As we will observe, we can use a faster rotate than the rotate in STL -because we have this additional storage. -If you want to rotate and you have enough storage, -then you only need n + n1 assignments, -which is for sure less than 3n.

    +

    As we will observe, we can use a faster rotate than the rotate in STL because we have this additional storage. +If you want to rotate and you have enough storage, then you only need n + n_1 assignments which is for sure less than 3n.
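Here is a sketch of that buffered rotate (our reconstruction of the idea, with assumed names; `n1` and `n2` are the lengths of `[first, middle)` and `[middle, last)`):

    template <typename I, typename B>
    // I is ForwardIterator
    // B is ForwardIterator (buffer with room for [first, middle))
    I rotate_with_buffer(I first, I middle, I last, B buffer) {
      B buffer_last = std::copy(first, middle, buffer); // n1 assignments
      I new_middle = std::copy(middle, last, first);    // n2 assignments
      std::copy(buffer, buffer_last, new_middle);       // n1 more, n + n1 total
      return new_middle;
    }

The middle copy is safe even though the ranges overlap, because the destination starts before the source and `std::copy` proceeds forward.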

    First steps

    There’s lots of things to do. How should we go about it? -The problem the problem with programming, specifically designing components -and decomposing the system is that you do not know what is right in isolation. +The problem with programming, specifically designing components +and composing the system, is that you do not know what is right in isolation. You never know what the correct interface is until you see it in other algorithms, and you see how those are used. -This is why you just have to try things, and ideas start emerging. -You might think it’s an infinite process. +This is why you just have to try things and ideas start emerging.

    + +

    You might think it’s an infinite process. No it’s not infinite, that’s the wonderful thing about life. It sort of terminates (I cannot prove it of course). -In practice if you start fitting things together you sort of discover what -you need to return, +In practice if you start fitting things together you sort of discover what you need to return, what you need to pass, what is the right thing to do, and that’s what I am trying to teach.

    -

    When should we try insertion point? -As a rule we want to fix the asmyptotic complexity. +

    When should we try insertion sort? +As a rule we want to fix the asymptotic complexity. Doing insertion sort at the bottom won’t help that. Right now we have a problem with our asymptotic complexity. -It’s O(n log^2(n)) we want to get rid of that square -as fast as possible.

    +It’s O(n log^2(n)). +We want to get rid of that square as fast as possible.

    I’m very lazy. So, we saw how fast we can get using no memory. @@ -414,7 +303,7 @@

    First steps

    AMD Ryzen 5 2400G (8 core, 3.6 GHz). GCC 9.3.0
  • Alex: It has been my dream for decades -to make stable sort as fast as sort, +to make my stable sort as fast as my sort, at which point I could throw away the other and just have one sort. But, I make progress with stable sort, then I make progress with sort.
  • diff --git a/16_optimizing_stable_sort.md b/16_optimizing_stable_sort.md index 7e38735..3673973 100644 --- a/16_optimizing_stable_sort.md +++ b/16_optimizing_stable_sort.md @@ -76,7 +76,7 @@ So it tells us we can do better. The numbers are nano-seconds per element. Our merge sort is called `inplace`. We start off ok, and then gradually get slower and slower in comparison. -Obviously, as the size increase you have to do more work. +Obviously, as the size increases you have to do more work. It's not linear. But, ours is clearly 3-4x slower than that. @@ -84,7 +84,7 @@ But, ours is clearly 3-4x slower than that. [cpp-stable-sort]: https://en.cppreference.com/w/cpp/algorithm/stable_sort -## A plan for improvement. +## A plan for improvement We are going to write a faster one. I am going to work with the class but I will not always tell you @@ -124,10 +124,10 @@ so we can remove them later if needed. Within 5-10% of the `std::stable_sort`. What would be great? Observe STL also has [`std::sort`][cpp-sort] because it's faster. - If it's as fast as sort[^dream]. + If it's as fast as sort, that would be great.[^dream]. [^dream]: Alex: It has been my dream for decades - to make stable sort as fast as sort, + to make my stable sort as fast as my sort, at which point I could throw away the other and just have one sort. But, I make progress with stable sort, then I make progress with sort. @@ -170,8 +170,8 @@ What is the complexity of merge sort, in terms of comparisons? `O(n log(n))`. What is the specific coefficient for insertion sort? The sorted portion is on average half the length of the original input. -In addition, on average when we an insert an element we -only have to go half the length before finding it's location. +In addition, on average when we insert an element we +only have to go half the length before finding its location. Therefore the complexity is roughly: `n^2/4`. @@ -212,9 +212,9 @@ otherwise do that". The term is *adaptive*. You might think everybody can write a good merge. -If you Google "std merge` it shows you [this][merge-code]: +If you Google "std merge" it shows you [this][merge-code]: Since you're a normal programmer -you migth say, "oh it's on the web, therefore I can copy and paste and use +you might say, "oh it's on the Web, therefore I can copy and paste and use it in my code." template @@ -236,50 +236,43 @@ An empty range will melt the computer. **Faster rotate** - What is the complexity of `std::rotate`? It's tricky, because it depends on the kind of iterators which you have. With `RandomAccessIterator` the best theoretical algorithm -does `n + gcd(n1, n2)` assignments, not swaps. +does `n + gcd(n_1, n_2)` assignments, not swaps. On average GCD is small, but larger than one. -So we can get almost to `n` assignments, which is a lot better -than `n` swaps. -For ForwardIterators it happens to be `n - gcd(n1, n2)` swaps. -It is roughly `n` for bidirectional iterators. - -As we will observe, we can use a faster rotate than the rotate in STL -because we have this additional storage. -If you want to rotate and you have enough storage, -then you only need `n + n1` assignments, -which is for sure less than `3n`. +So we can get almost to `n` assignments, which is a lot better than `n` swaps. +For `ForwardIterator` it happens to be `n - gcd(n_1, n_2)` swaps. +It is roughly `n` swaps for `BidirectionalIterator` (`3n` assignments). +As we will observe, we can use a faster rotate than the rotate in STL because we have this additional storage. 
+If you want to rotate and you have enough storage, then you only need `n + n_1` assignments which is for sure less than `3n`. ### First steps There's lots of things to do. How should we go about it? -The problem the problem with programming, specifically designing components -and decomposing the system is that you do not know what is right in isolation. +The problem with programming, specifically designing components +and composing the system, is that you do not know what is right in isolation. You never know what the correct interface is until you see it in other algorithms, and you see how those are used. -This is why you just have to try things, and ideas start emerging. +This is why you just have to try things and ideas start emerging. + You might think it's an infinite process. No it's not infinite, that's the wonderful thing about life. It sort of terminates (I cannot prove it of course). -In practice if you start fitting things together you sort of discover what -you need to return, +In practice if you start fitting things together you sort of discover what you need to return, what you need to pass, what is the right thing to do, and that's what I am trying to teach. - -When should we try insertion point? -As a rule we want to fix the asmyptotic complexity. +When should we try insertion sort? +As a rule we want to fix the asymptotic complexity. Doing insertion sort at the bottom won't help that. Right now we have a problem with our asymptotic complexity. -It's `O(n log^2(n))` we want to get rid of that square -as fast as possible. +It's `O(n log^2(n))`. +We want to get rid of that square as fast as possible. I'm very lazy. So, we saw how fast we can get using no memory. diff --git a/17_adaptive_merge_sort.html b/17_adaptive_merge_sort.html index 2160971..8a826d0 100644 --- a/17_adaptive_merge_sort.html +++ b/17_adaptive_merge_sort.html @@ -2,116 +2,9 @@ + 17. Adaptive merge sort - + @@ -141,8 +34,7 @@

    “temporary” buffers in STL

    But, it is vendor specific, you cannot do it as a client. There is actually no call in UNIX which tells you how much physical memory you have, how much is used, it’s just impossible. -But, I needed to ship it, and in order to do that, I couldn’t just -require them to add a hook. +But, I needed to ship it, and in order to do that, I couldn’t just require them to add a hook. So I wrote the following thing:

    // ask for n, system gives you as much as it can, but not more than n.
    @@ -179,8 +71,7 @@ 

    “temporary” buffers in STL

    It’s useful for remapping things3. But, it is a figment of imagination. It does not exist. -As Seymour Cray used to say, “You cant’t simulate what you do -not have”4. +As Seymour Cray used to say, “you can’t simulate what you do not have”4. If your algorithm working set doesn’t fit into physical memory, it will not just thrash, your program will not terminate, because your memory starts working at the speed of a disk. @@ -212,11 +103,11 @@

    Merge with buffer

    }
    -

    Even though we aren’t worry about it now, we can see the buffer +

    Even though we aren’t worried about it now, we can see the buffer will need to be big enough to copy the entire left half in, so about size n/2.

    -

    Note that the buffer doens’t have to match the type of the container. +

    Note that the buffer doesn’t have to match the type of the container. We will probably use an array for buffer, but I could be an iterator for a linked list. This is a general principle, relax type requirements.
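As a sketch of where this is heading (our reconstruction, not the chapter's final code):

    template <typename I, typename B, typename R>
    // I is ForwardIterator
    // B is ForwardIterator (buffer with room for [first, middle))
    // R is StrictWeakOrdering on the value type of I
    void merge_with_buffer_sketch(I first, I middle, I last, R r, B buffer) {
      B buffer_last = std::copy(first, middle, buffer); // move the left half out
      B f1 = buffer;
      I f2 = middle;
      I out = first;
      while (f1 != buffer_last && f2 != last) {
        if (r(*f2, *f1)) { *out = *f2; ++f2; } // strictly smaller: take from the right
        else { *out = *f1; ++f1; }             // on ties take from the left: stability
        ++out;
      }
      std::copy(f1, buffer_last, out); // whatever remains in the buffer
      // anything remaining in [f2, last) is already in place
    }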

    @@ -245,7 +136,7 @@

    Merge with buffer

    Note we put the buffer argument at the end, because we are extending the interface of the previous sort.

    -

    Now to use it in our framework we need a more convienent interface. +

    Now to use it in our framework we need a more convenient interface. We have too many parameters, so we need to somehow get rid of all of them. We write a wrapper.

    @@ -397,22 +288,32 @@

    Code


    1. -malloc returns null when it fails to allocate, -so this function uses that as an indicator that the requested buffer -was too large, and continues attempting smaller and smaller buffers.
2. +malloc returns NULL when it fails to allocate the requested size. +Alex’s get_temporary_buffer function uses that as an indicator that the requested buffer was too large +and continues attempting smaller and smaller buffers.
    3. -Virtual memory let’s you allocate more memory -than is available. Based on my testing on Linux, -the kernel will let you malloc about the total physical memory size. -So for any reasonable amount a program asks for -it will just return malloc(n). -Try his code out on your machine.
    4. +

      Virtual memory allows programs to allocate more memory than is physically available +by saving and loading portions of memory to disk as needed. +When memory is fully utilized the system starts working slower rather than simply crashing.

      + +

      Even though the total amount of virtual memory available on a system is very large, +individual memory allocations are typically limited. +For example, when testing this code on Linux, the system only +allows a program to allocate a buffer up to the total physical memory size.

      + +

      What this means is that Alex’s implementation of get_temporary_buffer is not useful. +It is equivalent to malloc(n) for anything but extremely large allocations.

      + +

      Exercise: Experiment with get_temporary_buffer on your machine. How large of an allocation will it give you?

    5. -Memory mapping is one such application. -It allows you to read and write to a file with pointers -as if it was loaded into memory. -I have seen Alex use it in his code before. -See mmap(2) for details.
    6. +

      Memory mapping files is a very useful application of virtual memory. +When a program wants to interact with a file on disk it can instead request that +the system map it to a range in memory. +The file can then be manipulated by reading and writing to pointers as if it was a buffer instead of a file. +In other words, the program can interact with the file, just like other data. +See mmap(2) for details.

      + +

      Alex has used memory mapped files in his own code.

    7. I cannot find a reference to this quotation.
    8. diff --git a/17_adaptive_merge_sort.md b/17_adaptive_merge_sort.md index daf37e7..8905b43 100644 --- a/17_adaptive_merge_sort.md +++ b/17_adaptive_merge_sort.md @@ -11,11 +11,9 @@ It's the only outside hook which STL will use. But, it is vendor specific, you cannot do it as a client. There is actually no call in UNIX which tells you how much physical memory you have, how much is used, it's just impossible. -But, I needed to ship it, and in order to do that, I couldn't just -require them to add a hook. +But, I needed to ship it, and in order to do that, I couldn't just require them to add a hook. So I wrote the following thing: - // ask for n, system gives you as much as it can, but not more than n. std::pair get_temporary_buffer(size_t n) { // this is bogus code and needs to be replaced @@ -27,7 +25,7 @@ So I wrote the following thing: return std::make_pair(buffer, n); } -So it binary searches for a buffer small enough to fit[^malloc-info]. +So it binary searches for a buffer small enough to fit[^malloc-fail-return]. Is it a useful piece of code? No. But, I had to ship. Guess what happened after that. @@ -49,8 +47,7 @@ There is virtual memory but virtual memory is actually useless unless it's backe It's useful for remapping things[^memory-map]. But, it is a figment of imagination. It does not exist. -As [Seymour Cray][cray] used to say, "You cant't simulate what you do -not have"[^ref-request]. +As [Seymour Cray][cray] used to say, "you can't simulate what you do not have"[^cray-ref-request]. If your algorithm working set doesn't fit into physical memory, it will not just [thrash][thrash], your program will not terminate, because your memory starts working at the speed of a disk. @@ -58,25 +55,35 @@ That's not good enough. It just shows you how imperfect life is. -[^malloc-info]: `malloc` returns `null` when it fails to allocate, - so this function uses that as an indicator that the requested buffer - was too large, and continues attempting smaller and smaller buffers. +[^malloc-fail-return]: `malloc` returns `NULL` when it fails to allocate of the requested size. + Alex's `get_temporary_buffer` function uses that as an indicator that the requested buffer was too large + and continues attempting smaller and smaller buffers. + +[^cray-ref-request]: I cannot find a reference to this quotation. + +[^virtual-memory]: Virtual memory allows programs to allocate more memory than is physically available + by saving and loading portions of memory to disk as needed. + When memory is fully utilized the system starts working slower rather than simply crashing. -[^ref-request]: I cannot find a reference to this quotation. + Even though the total amount of virtual memory available on a system is very large, + individual memory allocations are typically limited. + For example, when testing this code on Linux, the system only + allows a program to allocate a buffer up to the total physical memory size. -[^virtual-memory]: Virtual memory let's you allocate more memory - than is available. Based on my testing on Linux, - the kernel will let you malloc about the total physical memory size. - So for any reasonable amount a program asks for - it will just return `malloc(n)`. - Try his code out on your machine. + What this means is that Alex's implementation of `get_temporary_buffer` is not useful. + It is equivalent to `malloc(n)` for anything but extremely large allocations. -[^memory-map]: Memory mapping is one such application. 
- It allows you to read and write to a file with pointers - as if it was loaded into memory. - I have seen Alex use it in his code before. + **Exercise:** Experiment with `get_temporary_buffer` on your machine. How large of an allocation will it give you? + +[^memory-map]: Memory mapping files is a very useful application of virtual memory. + When a program wants to interact with a file on disk it can instead request that + the system map it to a range in memory. + The file can then be manipulated by reading and writing to pointers as if it was a buffer instead of a file. + In other words, the program can interact with the file, just like other data. See [mmap(2)](https://man7.org/linux/man-pages/man2/mmap.2.html) for details. + Alex has used memory mapped files in his own code. + [cray]: https://en.wikipedia.org/wiki/Seymour_Cray [thrash]: https://en.wikipedia.org/wiki/Thrashing_(computer_science) [virtual-memory]: https://en.wikipedia.org/wiki/Virtual_memory @@ -104,11 +111,11 @@ but later we will write a better one. } -Even though we aren't worry about it now, we can see the buffer +Even though we aren't worried about it now, we can see the buffer will need to be big enough to copy the entire left half in, so about size `n/2`. -Note that the buffer doens't have to match the type of the container. +Note that the buffer doesn't have to match the type of the container. We will probably use an array for buffer, but `I` could be an iterator for a linked list. This is a general principle, relax type requirements. @@ -139,7 +146,7 @@ No because it is recursive. Note we put the buffer argument at the end, because we are extending the interface of the previous sort. -Now to use it in our framework we need a more convienent interface. +Now to use it in our framework we need a more convenient interface. We have too many parameters, so we need to somehow get rid of all of them. We write a wrapper. @@ -280,4 +287,3 @@ but we are also using about 10x less memory. - [test_temp_buffer.cpp](code/test_temp_buffer.cpp) - diff --git a/18_binary_insertion_sort.html b/18_binary_insertion_sort.html index 1ac11a9..b38d6ea 100644 --- a/18_binary_insertion_sort.html +++ b/18_binary_insertion_sort.html @@ -2,116 +2,9 @@ + 18. Binary insertion sort - + @@ -138,9 +31,9 @@

      The Organ Grinder

I’ll just occasionally share a song or something like that which would indicate what mood I’m in. This is a very great song by Franz Schubert but it also -perfectly reflects will go on with the course in how I feel. -The song is called “The Organ Grinder” (Der Liermann). -The singer is Dietrich fischer-dieskau maybe the greatest +perfectly reflects what will go on with the course in how I feel. +The song is called “The Organ Grinder” (Der Leiermann). +The singer is Dietrich Fischer-Dieskau maybe the greatest lieder, or art song singer of the last 50, 60, or 70 years. He started singing in the late forties. Let us spend a couple of minutes and listen to it… (Video here)

      @@ -154,21 +47,21 @@

      Strategy

      First we will review the basic idea of algorithm. Always start with a picture:

      -
      |   sorted piece |  unsorted piece |
      +
      | sorted piece | unsorted piece |
       

      We start with an empty range on the left which is the sorted portion. We basically want to grow it, one element at a time, while ensuring it stays sorted. By repeating it inductively, eventually the whole range is sorted. -So, main idea is to pick an element in the unsorted piece, - find where the element goes, and insert it there.

      +So, the main idea is to pick an element in the unsorted piece, +find where the element goes, and insert it there.

      Insertion sort variations

      How many algorithmic versions of insertion sort are there? -Finding where it could go be done with either:

      +Finding where it should go could be done with either:

      1. Linear search
      2. @@ -176,18 +69,17 @@

        Insertion sort variations

      -

      There is another version which was invented, -as was everything else, by Tony Hoare. +

      There is another version which was invented (as everything else was) by Tony Hoare. He realized that in the inner loop of insertion sort you have to do two things:

        -
      1. You have to guard that you’re not crossing by size
      2. +
      3. Guard that you’re not crossing by size
      4. Guard that you’re not crossing the first
      -

      This makes the insertion sort do two comparisons per cycle. +

      This makes the insertion sort do two comparisons per cycle. You could have an insertion sort with a guard, assume that somebody puts (by hook or by crook) the smallest element first. @@ -206,16 +98,16 @@

      When is insertion sort useful?

    9. We already talked about when n is small. How small? We already proved it was when n = 16. Is it the exact? No, it’s not. -But, it’s a good rule of thumb.

    10. +But it’s a good rule of thumb.

    11. If we just have a few things -to add to a sorted list, that would be good. -In other words, most of the list is sorted, +to add to a sorted list that would be good. +In other words, most of the list is sorted but 16 or so elements are out of order.

    12. Insertion sort is going to move an element -from where it is, to where it should be, +from where it is to where it should be, one step at a time. So another case is when the average distance -from where it is, to where it should be, +from where it is to where it should be is small. It’s “nearly sorted”.

    @@ -223,7 +115,7 @@

    When is insertion sort useful?

    There are some considerations where you want to look at the relative cost but they are not important for asymptotic assessment. A quadratic algorithm, -regardless of the ratio between move and compare +regardless of the ratio between move and compare, is still a quadratic algorithm.

    @@ -232,13 +124,13 @@

    Naming insertion sort function

    Unfortunately, STL does not have insertion sort. Should it? Yes, it should. -But, they threw it out from the public library1. +But they threw it out from the public library1. At least put it in your library. It might not be called insertion sort. Maybe we should call it something else. What’s a good name? This is not a bogus question. -Finding a good name is important, +Finding a good name is important because we want to lead people to use it when these three conditions are met. Maybe, sort_almost_sorted. @@ -250,20 +142,18 @@

    Naming insertion sort function

    Naming is extremely hard, but very important. The goal is to name components so people can actually understand -what they mean, it helps people. +what they mean. It helps people. We have to discuss nomenclature. Respectable sciences spend most of their time discussing nomenclature. Chemists, physicists, they know what to call things. It’s only computer science that doesn’t.

    -

    I have to tell you a story Sean Parent -shared with me. +

    I have to tell you a story Sean Parent shared with me. When STL was introduced, people at Apple decided to try it. They tried it and found it absolutely unacceptable because they replaced their list with STL std::list and everything became extremely slow. -The problem is their list -was what is still called a vector. +The problem is their list was what is still called a vector. They didn’t realize linked lists are called “linked lists”. It sort of works, you know, slowly.

    @@ -311,9 +201,9 @@

    Binary insertion sort

    but we have to be careful. You might want to use the upper_bound we wrote together. But, remember it calls std::distance which is linear for ForwardIterator. -So let’s use upper_bound_n

    +So let’s use upper_bound_n.

    -

    What we will first write a function for finding where an element +

    What we will first write is a function for finding where an element goes and placing it there. Then we will structure our loop3 around that.

    @@ -372,38 +262,36 @@

    Rotate

    Rotate for bidirectional iterators

    -

    Once we find where it goes, how do we make room for it? +

    Once we find where it goes (the element to insert), how do we make room for it? We “rotate” to the right by one. -If it was a bidirectional iterator there is a beautiful algorithm. -Copy is the wrong thing, because it will overwrite everything -with the same value. -What we want is copying from the back. +If it is a bidirectional iterator there is a beautiful algorithm, copying from the back. The algorithm is called: std::copy_backward.

    template <typename I>
     // I is BidirectionalIterator
     void rotate_right_by_one(I first, I last, std::bidirectional_iterator_tag) {
       typedef typename std::iterator_traits<I>::value_type T;
    -  T butlast = last;
    +  I butlast = last;
       --butlast;
    +  T x = *butlast; 
       std::copy_backward(first, butlast, last);
       *first = x;
     }
     
    +

    Note that forward copy is the wrong thing, because it will overwrite everything with the same value (namely the first value).
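A quick hand trace of ours on `1 2 3 4`:

    1 2 3 4    x = 4, saved from the back
    1 1 2 3    copy_backward(first, butlast, last) shifts everything up by one
    4 1 2 3    *first = x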

    +

    Rotate for forward iterators

    -

    For forward iterator we have to shift all the elements up, +

    For ForwardIterator we have to shift all the elements up, we move one out of the way, to make room, and continue up the array until we find an empty place to put it.

    I think the problem is quite instructive not just because -it’s a useful algorithm, which it is, but because of the method for -deriving it. +it’s a useful algorithm, which it is, but because of the method for deriving it. Before coding, let us do a bit of mathematics. -You can always “haircut” -code, but remember mathematics? +You can always “haircut” code, but remember mathematics? I used to talk about it before they told me to switch to programming. Deriving mathematically is a good thing.

    @@ -414,22 +302,21 @@

    Rotate for forward iterators

    how do we do it? Done. That allows us to consider an inductive solution. -Somehow, by hook or crook we have an algorithm which knows how to shift things -n things, +Somehow, by hook or by crook, we have an algorithm which knows how to shift n things, such as the range:

    -
    a_{0}, ..., a_{n-1}
    +
    a_{0} ... a_{n-1}
     

    Then the question is, how could we get an algorithm for

    -
    a_{0}, ..., a_{n-1}, a_{n}
    +
    a_{0} ... a_{n-1} a_{n}
     

How do we add one additional element? After shifting the first n elements (leaving a_{n} fixed) we have:

    -
    a_{n-1}, a_{0}, ... a_{n-2}, a_{n} 
    +
    a_{n-1} a_{0} ... a_{n-2} a_{n} 
     

    What do we need to do to solve the problem? @@ -467,7 +354,7 @@

    Rotate for forward iterators

    void rotate_right_by_one(I first, I last, std::forward_iterator_tag) { if (first == last) return; I current = first; - while (++current != last) std::swap(first, current); + while (++current != last) std::swap(*first, *current); }
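A hand trace of ours on `1 2 3 4`; `first` stays put and acts as a traveling hole that always holds the value still to be placed:

    2 1 3 4    swap *first with the second element
    3 1 2 4    swap *first with the third
    4 1 2 3    swap *first with the fourth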
    @@ -476,8 +363,8 @@

    Rotate for forward iterators

    template <typename I>
     inline
    -void rotate_right_by_one(I first, I butlast, I last) {
    -  rotate_right_by_one(first, butlast, last, typename std::iterator_traits<I>::iterator_category());
    +void rotate_right_by_one(I first, I last) {
    +  rotate_right_by_one(first, last, typename std::iterator_traits<I>::iterator_category());
     }
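Putting the pieces together, here is a sketch of ours of the whole sort (the name and structure are assumed, not the book's final code; `std::upper_bound` stands in for the course's `upper_bound_n`, at the cost of an extra `std::distance` pass):

    template <typename I, typename R>
    // I is ForwardIterator
    // R is StrictWeakOrdering on the value type of I
    void binary_insertion_sort_sketch(I first, I last, R r) {
      if (first == last) return;
      I current = first;
      ++current;
      while (current != last) {
        // upper_bound inserts after equal elements, keeping the sort stable
        I where = std::upper_bound(first, current, *current, r);
        ++current;
        rotate_right_by_one(where, current); // make room and place the element
      }
    }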
     
    @@ -489,8 +376,7 @@

    Should we support forward iterator?

    because if we have something like a linked list, we don’t need to rotate or shift elements around, we can just insert it where it belongs. -That’s a good idea, but maybe some measurements -will show us otherwise. +That’s a good idea, but maybe some measurements will show us otherwise. We already implemented optimal linked list sort. Later we need to compare whether it’s actually faster to use our list sort, or to use our method @@ -498,19 +384,17 @@

    Should we support forward iterator?

    Why do you think I say that? List sort destroys locality. -If at every cdr (next), you get to a different cache +If at every cdr (next) you get to a different cache line, that’s a problem. -In our sort we constantly re-link -next, so eventually you get to a point where everything is scattered -all over memory.

    +In our sort we constantly re-link next, +so eventually you get to a point where everything is scattered all over memory.

    STL used to have a sentence in the container -section which the standard commitee threw out. +section which the standard committee threw out. Use a vector. This is a true statement. Unless you are absolutely positive -you need something else, use a vector -or a C array6.

    +you need something else, use a vector or a C array6.

    Code

    @@ -527,7 +411,7 @@

    Code

    Alex: Of course, STL still has insertion sort on the inside. It has to. What happened during the standardization process, -is they took something which was in the library and was used by the library, and threw it out. +is they took something which was in the library and was used by the library and threw it out. The argument was, “we already have too many sorts”. Is it a good argument? No, you need to have as many sorts as people might need @@ -537,12 +421,11 @@

    Code

1. std::sort the fastest sort.
2. std::stable_sort, this is merge sort, the one we are trying to write.
-3. std::partial_sort sort the first thousand, out of a million,
-   something you frequently do in search engines.
+3. std::partial_sort sort the first thousand, out of a million (something you frequently do in search engines).
4. std::nth_element. Not quite a sort, but it’s sort related.
   What it does is pin, for example the 30th percentile
-  element, and put all the smaller before, adn all the larger.
+  element, and put all the smaller before, and then all the larger.
   If I want to find another one, I can pin again, and sort in between, etc.
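A small illustration of the last two (my example, not from the lecture):

    #include <algorithm>
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};

        // partial_sort: only the first three positions end up sorted (0 1 2);
        // the remaining elements are left in unspecified order
        std::partial_sort(v.begin(), v.begin() + 3, v.end());

        // nth_element: pin the element at index 5; everything smaller lands
        // before it, everything larger after it, neither side sorted
        std::nth_element(v.begin(), v.begin() + 5, v.end());
        std::cout << "pinned element: " << *(v.begin() + 5) << '\n';  // 5
    }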
    @@ -564,16 +447,16 @@

    Code

    Alex: Someday we will get concepts in the C++ standard and not have to write these things. But that will be at least 5 years and I won’t be programming. -I’m like an old man planting an Apple tree. +I’m like an old man planting an apple tree.
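Concepts did eventually land in C++20. For comparison, a sketch (my addition, not from the course) of the same dispatch written as constrained overloads — constraint subsumption picks the bidirectional version automatically, with no tag arguments:

    #include <algorithm>
    #include <iterator>
    #include <utility>

    template <std::forward_iterator I>
    void rotate_right_by_one(I first, I last) {
        if (first == last) return;
        I current = first;
        while (++current != last) std::swap(*first, *current);
    }

    // bidirectional_iterator subsumes forward_iterator, so this overload
    // is more constrained and wins for bidirectional and stronger iterators
    template <std::bidirectional_iterator I>
    void rotate_right_by_one(I first, I last) {
        if (first == last) return;
        I butlast = std::prev(last);
        std::iter_value_t<I> x = *butlast;
        std::copy_backward(first, butlast, last);
        *first = x;
    }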
• Alex: I actually drop this requirement in STL and require
  -RandomAccessIterator, for all the sorts.
  -It wasn’t the standard committee’s fault just me.
  +RandomAccessIterator for all the sorts.
  +It wasn’t the standard committee’s fault, just me.
  I am not sure if I agree with myself.

The reasoning went like so.
-Most people of course, don’t know anything.
+Most people of course don’t know anything.
Therefore if you give them things
which sort ForwardIterators, they will
attempt to use them on things like linked lists.

@@ -584,8 +467,8 @@

    Code

I was making decisions by saying,
“I know how to do it in the more general case.
But, I will not let programmers do it because
-they are immature.”.
-This “nanny” control is not necessarily a good thing.
+they are immature.”
+This nanny control is not necessarily a good thing.
I am of two minds here.
I am trying to not be a nanny here.
I’m trying to show you the spectrum.

@@ -594,12 +477,12 @@

    Code

    It probably will never happen in your life. But, it just might for at least one of you.

• -Alex: Because they threw it out, people like
  -Herb Sutter used to recommend to the world
  +Alex: Because they threw out that instruction, people like
  +Herb Sutter used to recommend to the world
  to use std::deque (see “Using Vector and Deque”).
  I’m not making it up.
  -He thought, that it’s better because it supports more operations.
  -He was wrong, I wrote both std::vector and std::deque.
  • +He thought that it’s better because it supports more operations. +He was wrong and I wrote both std::vector and std::deque. diff --git a/18_binary_insertion_sort.md b/18_binary_insertion_sort.md index e19d710..b1a90e4 100644 --- a/18_binary_insertion_sort.md +++ b/18_binary_insertion_sort.md @@ -8,9 +8,9 @@ I used to tell you stories but right now I decided I'll just occasionally share a song or something like that which would indicate what mood I'm in. This is a very great song by [Franz Schubert][schubert] but it also -perfectly reflects will go on with the course in how I feel. -The song is called ["The Organ Grinder"][winter-journey] (Der Liermann). -The singer is [Dietrich fischer-dieskau][dietrich] maybe the greatest +perfectly reflects what will go on with the course in how I feel. +The song is called ["The Organ Grinder"][winter-journey] (Der Leiermann). +The singer is [Dietrich Fischer-Dieskau][dietrich] maybe the greatest leader, or art song singer of the last 50, 60, or 70 years. He started singing in the late forties. Let us spend a couple of minutes and listen to it... ([Video here][organ-grinder]) @@ -28,32 +28,31 @@ you ever wanted to know. First we will review the basic idea of algorithm. Always start with a picture: - | sorted piece | unsorted piece | + | sorted piece | unsorted piece | We start with an empty range on the left which is the sorted portion. We basically want to grow it, one element at a time, while ensuring it stays sorted. By repeating it inductively, eventually the whole range is sorted. -So, main idea is to pick an element in the unsorted piece, - find where the element goes, and insert it there. +So, the main idea is to pick an element in the unsorted piece, +find where the element goes, and insert it there. ### Insertion sort variations How many algorithmic versions of insertion sort are there? -Finding where it could go be done with either: +Finding where it should go could be done with either: 1. Linear search 2. Binary search -There is another version which was invented, -as was everything else, by [Tony Hoare][hoare]. +There is another version which was invented (as everything else was) by [Tony Hoare][hoare]. He realized that in the inner loop of insertion sort you have to do two things: -1. You have to guard that you're not crossing by size +1. Guard that you're not crossing by size 2. Guard that you're not crossing the first -This makes the insertion sort do two comparisons per cycle. +This makes the insertion sort do two comparisons per cycle. You could have an insertion sort with a guard, assume that somebody puts (by hook or by crook) the smallest element first. @@ -72,24 +71,24 @@ This is an interesting point we should discuss. 1. We already talked about when `n` is small. How small? We already proved it was when `n = 16`. Is it the exact? No, it's not. - But, it's a good rule of thumb. + But it's a good rule of thumb. 2. If we just have a few things - to add to a sorted list, that would be good. - In other words, most of the list is sorted, + to add to a sorted list that would be good. + In other words, most of the list is sorted but 16 or so elements are out of order. 3. Insertion sort is going to move an element - from where it is, to where it should be, - one step at a time. - So another case is when the average distance - from where it is, to where it should be, + from where it is to where it should be, + one step at a time. + So another case is when the average distance + from where it is to where it should be is small. 
It's "nearly sorted". There are some considerations where you want to look at the relative cost but they are not important for asymptotic assessment. A quadratic algorithm, -regardless of the ratio between move and compare +regardless of the ratio between move and compare, is still a quadratic algorithm. ### Naming insertion sort function @@ -97,13 +96,13 @@ is still a quadratic algorithm. Unfortunately, STL does not have insertion sort. Should it? Yes, it should. -But, they threw it out from the public library[^sorts-in-stl]. +But they threw it out from the public library[^sorts-in-stl]. At least put it in your library. It might not be called insertion sort. Maybe we should call it something else. What's a good name? This is not a bogus question. -Finding a good name is important, +Finding a good name is important because we want to lead people to use it when these three conditions are met. Maybe, `sort_almost_sorted`. @@ -115,31 +114,29 @@ So instead we will settle on `binary_insertion_sort`. Naming is extremely hard, but very important. The goal is to name components so people can actually understand -what they mean, it helps people. +what they mean. It helps people. We have to discuss nomenclature. Respectable sciences spend most of their time discussing nomenclature. Chemists, physicists, they know what to call things. It's only computer science that doesn't. -I have to tell you a story [Sean Parent][parent] -shared with me. +I have to tell you a story [Sean Parent][sean-parent] shared with me. When STL was introduced, people at Apple decided to try it. They tried it and found it absolutely unacceptable because they replaced their `list` with STL `std::list` and everything became extremely slow. -The problem is their list -was what is still called a vector. +The problem is their list was what is still called a vector. They didn't realize linked lists are called "linked lists". It sort of works, you know, slowly. [insertion-sort]: https://en.wikipedia.org/wiki/Insertion_sort -[parent]: https://sean-parent.stlab.cc/papers-and-presentations/ +[sean-parent]: https://sean-parent.stlab.cc/papers-and-presentations/ [^sorts-in-stl]: Alex: Of course, STL still has insertion sort on the inside. It has to. What happened during the standardization process, - is they took something which was in the library and was used by the library, and threw it out. + is they took something which was in the library and was used by the library and threw it out. The argument was, "we already have too many sorts". Is it a good argument? No, you need to have as many sorts as people might need @@ -148,12 +145,11 @@ It sort of works, you know, slowly. 1. [`std::sort`](https://en.cppreference.com/w/cpp/algorithm/sort) the fastest sort. 2. [`std::stable_sort`](https://en.cppreference.com/w/cpp/algorithm/stable_sort), this is merge sort, the one we are trying to write. - 3. [`std::partial_sort`](https://en.cppreference.com/w/cpp/algorithm/partial_sort) sort the first thousand, out of a million, - something you frequently do in search engines. + 3. [`std::partial_sort`](https://en.cppreference.com/w/cpp/algorithm/partial_sort) sort the first thousand, out of a million (something you frequently do in search engines). 4. [`std::nth_element`](https://en.cppreference.com/w/cpp/algorithm/nth_element). Not quite a sort, but it's sort related. What it does is pin, for example the 30th percentile - element, and put all the smaller before, adn all the larger. + element, and put all the smaller before, and then all the larger. 
If I want to find another one, I can pin again, and sort inbetween, etc. ## Binary insertion sort @@ -182,7 +178,6 @@ algorithms. return last; } - (Recall, that we proved 16 was a good cutoff.) The standard C convention for old people is that `ALL_CAPS` means it's a macro. We will use this for a constant here[^macros-comment]. @@ -199,9 +194,9 @@ Using `ForwardIterator` is actually a piece of cake for the binary search, but we have to be careful. You might want to use the `upper_bound` we wrote together. But, remember it calls `std::distance` which is linear for `ForwardIterator`. -So let's use `upper_bound_n` +So let's use `upper_bound_n`. -What we will first write a function for finding where an element +What we will first write is a function for finding where an element goes and placing it there. Then we will structure our loop[^for-loop] around that. @@ -221,7 +216,6 @@ Then we will structure our loop[^for-loop] around that. It's important to return here, in case someone else wants to use this function. - To write the loop that calls this function, I suggest that we carefully write invariants. We have the range: @@ -235,7 +229,6 @@ What we want is: That's the invariant on which we rely. - template // I is ForwardIterator // N is Integral @@ -270,37 +263,34 @@ That's the invariant on which we rely. ### Rotate for bidirectional iterators -Once we find where it goes, how do we make room for it? +Once we find where it goes (the element to insert), how do we make room for it? We "rotate" to the right by one. -If it was a bidirectional iterator there is a beautiful algorithm. -Copy is the wrong thing, because it will overwrite everything -with the same value. -What we want is copying from the back. +If it is a bidirectional iterator there is a beautiful algorithm, copying from the back. The algorithm is called: [`std::copy_backward`][cpp-copy-back]. template // I is BidirectionalIterator void rotate_right_by_one(I first, I last, std::bidirectional_iterator_tag) { typedef typename std::iterator_traits::value_type T; - T butlast = last; + I butlast = last; --butlast; + T x = *butlast; std::copy_backward(first, butlast, last); *first = x; } +Note that forward copy is the wrong thing, because it will overwrite everything with the same value (namely the first value). ### Rotate for forward iterators -For forward iterator we have to shift all the elements up, +For `ForwardIterator` we have to shift all the elements up, we move one out of the way, to make room, and continue up the array until we find an empty place to put it. I think the problem is quite instructive not just because -it's a useful algorithm, which it is, but because of the method for -deriving it. +it's a useful algorithm, which it is, but because of the method for deriving it. Before coding, let us do a bit of mathematics. -You can always "haircut" -code, but remember mathematics? +You can always "haircut" code, but remember mathematics? I used to talk about it before they told me to switch to programming. Deriving mathematically is a good thing. @@ -311,20 +301,19 @@ If we have a one-element sequence `a_0` and we want to rotate it, how do we do it? Done. That allows us to consider an inductive solution. -Somehow, by hook or crook we have an algorithm which knows how to shift things -`n` things, +Somehow, by hook or by crook, we have an algorithm which knows how to shift `n` things, such as the range: - a_{0}, ..., a_{n-1} + a_{0} ... 
a_{n-1} Then the question is, how could we get an algorithm for - a_{0}, ..., a_{n-1}, a_{n} + a_{0} ... a_{n-1} a_{n} How do we add one additional element? After the shift the first `n` elements (leaving `a_{n}` fixed) we have: - a_{n-1}, a_{0}, ... a_{n-2}, a_{n} + a_{n-1} a_{0} ... a_{n-2} a_{n} What do we need to do to solve the problem? Just swap `a_{n-1}` and `a_{n}`. @@ -343,6 +332,7 @@ swap the first element, and the one following our range: [2 1] 3 + Now we have the first two rotated. To rotate the full sequence we once again swap the first and last: @@ -357,24 +347,23 @@ It might not be the fastest, but it is going to be much more elegant. void rotate_right_by_one(I first, I last, std::forward_iterator_tag) { if (first == last) return; I current = first; - while (++current != last) std::swap(first, current); + while (++current != last) std::swap(*first, *current); } - Let's write a dispatch for both versions, -it will compile to no code[^concepts]. +it will compile to no code[^concepts-in-standard-soon]. template inline - void rotate_right_by_one(I first, I butlast, I last) { - rotate_right_by_one(first, butlast, last, typename std::iterator_traits::iterator_category()); + void rotate_right_by_one(I first, I last) { + rotate_right_by_one(first, last, typename std::iterator_traits::iterator_category()); } -[^concepts]: Alex: Someday we will get concepts +[^concepts-in-standard-soon]: Alex: Someday we will get concepts in the C++ standard and not have to write these things. But that will be at least 5 years and I won't be programming. - I'm like an old man planting an Apple tree. + I'm like an old man planting an apple tree. ### Should we support forward iterator? @@ -383,8 +372,7 @@ really make sense for this algorithm[^stl-forward-iterators], because if we have something like a linked list, we don't need to rotate or shift elements around, we can just insert it where it belongs. -That's a good idea, but maybe some measurements -will show us otherwise. +That's a good idea, but maybe some measurements will show us otherwise. We already implemented optimal linked list sort. Later we need to compare whether it's actually faster to use our list sort, or to use our method @@ -392,37 +380,31 @@ we develop. Why do you think I say that? List sort destroys locality. -If at every `cdr` (next), you get to a different cache +If at every `cdr` (next) you get to a different cache line, that's a problem. -In our sort we constantly re-link -next, so eventually you get to a point where everything is scattered -all over memory. +In our sort we constantly re-link next, +so eventually you get to a point where everything is scattered all over memory. STL used to have a sentence in the container -section which the standard commitee threw out. +section which the standard committee threw out. Use a vector. This is a true statement. Unless you are absolutely positive -you need something else, use a vector -or a C array[^sutter-advice]. - - - +you need something else, use a vector or a C array[^sutter-deque]. [^for-loop]: Alex: Could we use a `for` loop instead of a `while`? Yes, but I hate `for` loops. Why? Because the semantics have changed about 6 times, since I started C++, `while` loops never changed. - [^stl-forward-iterators]: Alex: I actually drop this requirement in STL and require - `RandomAccessIterator`, for all the sorts. - It wasn't the standard committee's fault just me. + `RandomAccessIterator` for all the sorts. + It wasn't the standard committee's fault, just me. 
I am not sure if I agree with myself. The reasoning went like so. - Most people of course, don't know anything. + Most people of course don't know anything. Therefore if you give them things which sort `ForwardIterator`s, they will attempt to use them on things like linked lists. @@ -433,8 +415,8 @@ or a C array[^sutter-advice]. I was making decisions by saying, "I know how to do it in the more general case. But, I will not let programmers do it because - they are immature.". - This "nanny" control is not necessarily a good thing. + they are immature." + This nanny control is not necessarily a good thing. I am of two minds here. I am trying to not be a nanny here. I'm trying to show you the spectrum. @@ -443,14 +425,14 @@ or a C array[^sutter-advice]. It probably will never happen in your life. But, it just might for at least one of you. -[^sutter-advice]: Alex: Because they threw it out, people like - [Herb Sutter][sutter] used to recommend to the world - to use [`std::deque`][cpp-deque] (see ["Using Vector and Deque"][vector-and-deque]). +[^sutter-deque]: Alex: Because they threw out that instruction, people like + [Herb Sutter][sutter] use to recommend to the world + to use [`std::deque`][cpp-deque] (see ["Using Vector and Deque"][sutter-vector-and-deque]). I'm not making it up. - He thought, that it's better because it supports more operations. - He was wrong, I wrote both `std::vector` and `std::deque`. + He thought that it's better because it supports more operations. + He was wrong and I wrote both `std::vector` and `std::deque`. -[vector-and-deque]: http://www.gotw.ca/gotw/054.htm +[sutter-vector-and-deque]: http://www.gotw.ca/gotw/054.htm [cpp-copy-back]: https://en.cppreference.com/w/cpp/algorithm/copy_backward [cpp-deque]: https://en.cppreference.com/w/cpp/container/deque [sutter]: https://en.wikipedia.org/wiki/Herb_Sutter diff --git a/19_insertion_sort.html b/19_insertion_sort.html index f96e0e0..c1401a3 100644 --- a/19_insertion_sort.html +++ b/19_insertion_sort.html @@ -2,116 +2,9 @@ + 19. Linear insertion sort - + @@ -133,7 +26,7 @@

    19. Linear insertion sort

    Thank you, noble art.

    -

    Last time we started with “The Organ Grinder” (Der Liermann) +

    Last time we started with “The Organ Grinder” (Der Leiermann) which explains how it feels to stand outside in the cold with an empty tray being barked at by stray dogs. That’s how it feels when I start here (joke).

    @@ -150,7 +43,7 @@

    Thank you, noble art.

    She is also a beautiful woman. Let us proceed… (Video here).

    -

    Shuebert was 18 years old when he composed it. +

Schubert was 18 years old when he composed it.
If I had a choice between founding Facebook or writing this song,
as my lifetime accomplishment, I would not hesitate one second,
and it’s not going to be the Facebook.

@@ -176,11 +69,11 @@

    Linear insertion sort

    It’s a very important component to be used in multiple places. For example, when implementing quicksort. Plus, we will be able to investigate -several interesting techniques and discover one deep problem on this very +several interesting techniques and discover one deep problem on this very simple sorting algorithm. Binary search is not always a good thing to do.

    -

    Remember insertion sort is good when things are almost sorted. +

Remember insertion sort is good when things are almost sorted.
They are pretty close to where they should be.
Binary search is going to poke far away.
It’s going to do a little poking, logarithmic poking,

@@ -199,7 +92,7 @@

    Linear insert

    We find the main loop invariant. In order to be able to do the right thing, we need to see what we need to accomplish.

    -

    Our goal (like binary insert) is to +

Our goal, like binary insert, is to
insert an element into the portion
at the front of the range.
We first copy the element value out to insert,

@@ -213,20 +106,19 @@
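The figure from the notes is lost in this rendering; roughly, the situation looks like this (my reconstruction):

    value = *current        (copied out, leaving a "hole" at current)

    | sorted . . . [hole] | unsorted . . . |
      ^             ^
      first         current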

    Linear insert

When are we allowed to move the hole?
-What is the condition?
-The first is that hole != first.
-If this happens, we cannot move any further.
-The other condition is:

    +What are the conditions?

+1. hole != first. If this happens, we cannot move any further.
+2. value < prev(hole).

-    prev(hole) < value

If both hold, we continue to move left.
Eventually, one of the conditions will not hold.
-We can even prove it.
+We can even prove it:
There are only finitely many things in the range,
-so after so many iterations it will be exhausted.
-These termination proofs are not usually very profound.

    +so after so many iterations it will be exhausted (these termination proofs are not usually very profound).

    In our code we will call hole, current:
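The hunk below only shows the tail of the function, so here is a sketch of the whole thing, consistent with the two conditions above and with predecessor as defined shortly (a reconstruction, not the verbatim diff):

    template <typename I, typename R>
    // I is BidirectionalIterator
    // R is WeakStrictOrdering on the value type of I
    I linear_insert(I first, I current, R r) {
      // precondition: [first, current) is sorted with respect to r
      typedef typename std::iterator_traits<I>::value_type T;
      T value = *current;  // copy the element out, leaving a hole
      while (first != current && r(value, *predecessor(current))) {
        *current = *predecessor(current);  // move the hole one step left
        --current;
      }
      *current = value;    // drop the element into the hole
      return current;
    }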

    @@ -246,9 +138,10 @@

    Linear insert

    }
    -

    When first == current at the start, it will swap. -Would a check be better? -As we have talked about it before, this is a case that seldom happens +

    When first == current at the start, it will copy *current to a temporary +variable and put it right back. +Would a check be better to avoid this? +As we have talked about before, this is a case that seldom happens whereas adding an explicit check would slow down every other case.

    Of course, we need to define predecessor:

    @@ -259,8 +152,8 @@

    Linear insert

    I predecessor(I x) { return --x; }
    - -

    Insertion sort

    + +

    Traditional insertion sort

    Now linear insertion sort is about identical to binary insertion sort, we just use linear_insert, instead of binary_insert.

    @@ -286,7 +179,7 @@

    Insertion sort

    Let’s write the version for bounded ranges. It’s very easy. As a base case in our induction, an empty range, and a one -element range are both sorted.

    +element range, are both sorted.

    // I is BidirectionalIterator
     // R is WeakStrictOrdering on the value type of I 
    @@ -301,23 +194,23 @@ 

    Insertion sort

    }
    - -

    Sentinel version

    + +

    Sentinel insertion sort

    I think we can optimize it further. You might argue we don’t need to. But, let me tell you one of the most humiliating times in my life. -I was giving a talk at Stanford and a certain professor walked into the talk -and when I showed my code, roughly like what we just wrote. -It was a little different, but same idea. -I said, it’s obviously optimal. +I was giving a talk at Stanford. +A certain professor walked into the talk +and when I showed my code, roughly like what we just wrote (it was a little different, but same idea), I said, “it’s obviously optimal”. This professor interrupted -and said, no it’s not. +and said, “no it’s not”. The trouble is his name was Donald Knuth. -When somebody like Don Knuth says that your code is not optimal you are in a difficult situation. +When somebody like Don Knuth says that your code is not optimal, +you are in a difficult situation. His argument was that we do this conditional check:

    -
    while (first != current ...)
    +
    first != current
     

    n times, when we don’t need to. @@ -326,8 +219,8 @@

    Sentinel version

    to make an effort to use them more.

    This is a valid argument. -We are not here to impose some theoretical conditions on algorithms we are here to -take existing efficient algorithms and find how to express them. +We are not here to impose some theoretical conditions on algorithms. +We are here to take existing efficient algorithms and find how to express them. We have to write whatever we write to enable this code to work. What this means is that we sometimes have to reject or ignore other notions of “good software engineering” in order to get @@ -371,8 +264,8 @@

    Sentinel version

    }
    - -

    Insertion sort in quicksort

    + +

    Application to quicksort

    Now we need to write a new insertion sort which guarantees this condition. @@ -383,20 +276,20 @@

    Insertion sort in quicksort

    A long time ago the person who invented it, Tony Hoare observed that quicksort becomes inefficient towards the end of recursion, when you start doing partitions of very small things. -He suggested that we run insertion sort, down there at the lower +He suggested that we run insertion sort down there, at the lower levels of recursion. -Originally, people thought they would just wait for the range -to get small, and then call insertion sort every time. +Originally people thought they would go down recursively and call insertion +sort every time we reach a small subrange. But, Hoare observed you really don’t need to call insertion sort many times. You can just stop the recursion when quicksort sorts things up to a certain size, -and then run one pass of insertion sort, over the whole thing. -Remember, we talked about how insertion sort is good when things are -almost where they should be.

    +and then, at the end, run one pass of insertion sort, over the whole thing. +Because when things are almost where they should be, insertion sort is effective. +Quicksort can guarantee that eventually everyone will be within some threshold of their final destination.

    Let’s assume we are sorting a million elements. After we are done with quicksort we have a threshold, -say k, and all the elements are partitioned -into blocks, or sub ranges of this size.

    +say k, and all the elements are partitioned. +The first partition will be somewhere within the first k elements,

    [      |            ...            ]
     ^      ^                        
    @@ -413,13 +306,16 @@ 

    Insertion sort in quicksort

    [ 1, 3, 5 | 2, 11, 17 ... ]
     
    -

    How can we have 2 on the right side? We don’t know exactly -where quicksort stopped, just that it’s in some threshold. -5 is not a sentinel for the right side, because 2 is smaller. +

    5 is not a sentinel for the right side, because 2 is smaller. 1 is the sentinel. -Now that is quicksort, but we will design our components in a way to support it.

    - -

    Let’s write a function that assumes we have a prefix that is sorted +The line drawn in that range is not necessarily a quicksort partition. +If it were, we couldn’t have 2 on the right side. +We have absolutely no idea where the real quicksort partitions are. +But, we know that there is a partition boundary within some threshold +(say left of the line).

    + +

    Now that is quicksort, but we will design our components in a way to support it. +Let’s write a function that assumes we have a prefix that is sorted and contains a sentinel.
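The hunk below elides most of the body; here is a reconstruction consistent with the calls shown later (not the verbatim diff) — the sentinel sitting just before first is what lets every insertion use the unguarded inner loop:

    template <typename I, typename R>
    // I is BidirectionalIterator
    // R is WeakStrictOrdering on the value type of I
    void insertion_sort_suffix(I first, I last, R r) {
      // precondition: the element just before first is a sentinel,
      // i.e. not greater than anything in [first, last)
      I current = first;
      while (current != last) {
        linear_insert_with_sentinel(first, current, r);
        ++current;
      }
    }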

    template <typename I, typename R>
    @@ -439,6 +335,9 @@ 

    Insertion sort in quicksort

    }
    + +

    Optimized insertion sort

    +

    So let us copy linear_insertion_sort and make our definitive insertion_sort.

    template <typename I, typename R>
    @@ -450,7 +349,8 @@ 

    Insertion sort in quicksort

  ++current;
  if (current == last) return;
  // create a sentinel
-  rotate_right_by_one(first, ++std::min_element(first, last, r));
+  I min = std::min_element(first, last, r);
+  rotate_right_by_one(first, ++min);
  insertion_sort_suffix(current, last, r);
}
    @@ -467,7 +367,8 @@

    Insertion sort in quicksort

  ++current;
  if (current == last) return;
  // create a sentinel
-  std::swap(*first, *std::min_element(first, last, r));
+  I min = std::min_element(first, last, r);
+  std::swap(*first, *min);
  insertion_sort_suffix(current, last, r);
}
    @@ -476,13 +377,11 @@

    Insertion sort in quicksort

    Selection sort

    This leads us to another classical sort, -it’s very slow, but since it’s classical and takes only a few lines -we will discovery it.

    +it’s very slow, but since it’s classical and takes only a few lines we will discover it.

    What’s the idea of selection sort? You find the min, put him in the first place. -You find the min of the remaining range, put him in the next place, -and so on. +You find the min of the remaining range, put him in the next place, and so on. Could we write it?

    template <typename I, typename R>
    @@ -490,34 +389,36 @@ 

    Selection sort

// R is WeakStrictOrdering on the value type of I
void selection_sort(I first, I last, R r) {
  while (first != last) {
-    std::swap(*first, *std::min_element(first, last, r));
+    I min = std::min_element(first, last, r);
+    std::swap(*first, *min);
    ++first;
  }
}

    It’s not stable, but it’s not hard to fix. -The problem is swap might skip over lot’s of equal guys.

    +The problem is std::swap might skip over lots of equal guys.

    template <typename I, typename R>
     // I is ForwardIterator
     // R is WeakStrictOrdering on the value type of I 
     void stable_selection_sort(I first, I last, R r) {
       while (first != last) {
    -    rotate_right_by_one(first, ++std::min_element(first, last, r));
    +    I min = std::min_element(first, last, r);
    +    rotate_right_by_one(first, ++min);
         ++first;
       }
     }
     
    -

    Comparison is typically fast, but swap we tend to think of as slow. -Imagine they are big buildings that need to be carried. -The unstable is actually amazing in that it just does n - 1 swaps, -always. -Merge sort, quicksort, the do like n log(n). +

Comparison is typically fast, but swap we tend to think of as slow.
+Imagine the elements are buildings that need to be lifted up and carried to swap places.
+The unstable selection_sort is actually amazing in that it just does n - 1 swaps, always.
+Merge sort, quicksort, they do like n log(n) swaps.
Is it practically important? No.
-Not once have I needed it.
-However, it shows us how to create a sentinel.
+Not once have I needed selection sort.
+So why do I talk about it?
+It shows us how to create a sentinel.
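To make the stability point concrete, a small test of my own (assuming selection_sort, stable_selection_sort, and rotate_right_by_one above are in scope): two elements with equal keys must keep their relative order under the stable version.

    #include <iostream>
    #include <utility>

    struct first_less {
        bool operator()(const std::pair<int, char>& a,
                        const std::pair<int, char>& b) const {
            return a.first < b.first;
        }
    };

    int main() {
        // two equal keys (1,'a') and (1,'b'); 'a' must stay before 'b'
        std::pair<int, char> v[] = { std::make_pair(1, 'a'),
                                     std::make_pair(0, 'x'),
                                     std::make_pair(1, 'b') };
        stable_selection_sort(v, v + 3, first_less());
        for (int i = 0; i < 3; ++i)
            std::cout << v[i].first << v[i].second << ' ';  // 0x 1a 1b
        std::cout << '\n';
    }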

    +Not once have I needed selection sort. +So why do I talk about it? +It shows us how to create a sentinel.

    Preconditions are essential

    @@ -530,62 +431,53 @@

    Preconditions are essential

It randomly generates true or false.
Then take std::sort and pass this function to it,
because it will obviously create a randomly shuffled thing.
-Low and behold to everybody’s amazement it caused
-segmentation fault.
-There were messages throughout Google saying STL is
-totally broken.
+Lo and behold to everybody’s amazement it caused a segmentation fault.
+There were messages throughout Google saying, “STL is totally broken”.
Obviously, because it brings Google down.
Let’s argue why he shouldn’t do what he did.

      -
-1. There is an algorithm in stl called std::random_shuffle.
+1. There is an algorithm in STL called std::random_shuffle.
   Why not use that?

-2. Somebody more advanced, would say, even if it worked, it wouldn’t
-   be a uniform random shuffle.
-   It is screwed up, but it requires probability theory or Knuth.
+2. Somebody more advanced, would say, even if it worked, it wouldn’t be a uniform random shuffle1.
+   What he did is screwed up.
+   But knowing that requires probability theory or Knuth.
   These people at Google just don’t read.
   The brightest people do not need to read (joke).

-3. My dear friend Dave Musser who was on sabbatical at Google ventured to post
-   that he did not satisfy the preconditions.
-   Random true/false is not a weak strict ordering,
-   or any ordering whatsoever.
-   He tried to explain, but they said no.
-   It should work with whatever I pass to it.
+3. My dear friend Dave Musser who was on sabbatical at Google ventured to post that he did not satisfy the preconditions.
+   Randomly returning true or false is not a weak strict ordering, or any ordering whatsoever.
+   He tried to explain, but they said no.
+   It should work with whatever I pass to it.
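For concreteness, here is the shape of the mistake next to the component that should have been used — a sketch in the C++98 style of the course (note that std::random_shuffle was later deprecated and removed in C++17):

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    // Not an ordering: the answer changes from call to call, so std::sort's
    // precondition is violated and it may even read out of bounds.
    bool coin_flip(int, int) { return std::rand() % 2 == 0; }

    int main() {
        std::vector<int> v;
        for (int i = 0; i < 100; ++i) v.push_back(i);

        // std::sort(v.begin(), v.end(), coin_flip);  // undefined behavior

        std::random_shuffle(v.begin(), v.end());      // the right component
    }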

    -

    As you can imagine, we cannot rely on any properties, like sentinel with this -going on. +

    As you can imagine, we cannot rely on any properties, like sentinel with this going on. For a while there were a bunch of postings on the internet saying, do not use std::sort because it requires WeakStrictOrdering. It’s provably the weakest possible requirement. I thought it was good, but they turned it around and said no. Use std::stable_sort. -I still see this in code, people use it when they don’t need -stability because they read these discussions.

    +I still see this in code, people use it when they don’t need stability because they read these discussions.

    -

    Apparently it is an expectation of a modern -programmer that you don’t have to satisfy any precondition. +

Apparently it is an expectation of a modern programmer
that you don’t have to satisfy any precondition.
Things should do something and never cause a segmentation fault.
It is a tricky thing. Nowadays I wonder.
What should we do when we build components?
-Should we assume that we build them the fastest way and carefully
-specify preconditions?
-Or should we build idiot-proof or Google quality components.
+Should we assume that we build them the fastest way and carefully specify preconditions?
+Or should we build idiot-proof (Google quality) components?
This is a difficult question. I do not know the answer.

    Final project

    -

    Write the fastest version of stable sort that you can, utilizing the ideas -in these tools. +

    Write the fastest version of stable sort that you can, utilizing the ideas in these tools. We have all the algorithmic ideas we need. But, I invite you to read books. I invite you to test and measure. If you want you can even go read old STL code I wrote. -It’s a competition, consider teaming up and sharing ideas.

    +It’s a competition! Consider teaming up and sharing ideas.

    Code

    @@ -595,6 +487,18 @@

    Code

  • test_insertion_sort.cpp
  • +
    +
    +
      +
    1. +A uniform shuffle of a range of elements x1 ... xn +is an algorithm which produces a random permutation of the elements, +in a manner such that all possible permutations are equally likely. +Since there are n! permutations, each permutation should occur +with probability 1/n! (See “The Art of Computer Programming” 3.4.2).
    2. +
    +
    + [ diff --git a/19_insertion_sort.md b/19_insertion_sort.md index ceccecc..e29931a 100644 --- a/19_insertion_sort.md +++ b/19_insertion_sort.md @@ -3,7 +3,7 @@ ## Thank you, noble art. -Last time we started with ["The Organ Grinder"][organ-grinder] (Der Liermann) +Last time we started with ["The Organ Grinder"][organ-grinder] (Der Leiermann) which explains how it feels to stand outside in the cold with an empty tray being barked at by stray dogs. That's how it feels when I start here (joke). @@ -20,8 +20,7 @@ In this video the words will be first introduced by a great English pianist [Ger She is also a beautiful woman. Let us proceed... ([Video here][to-music]). - -Shuebert was 18 years old when he composed it. +Schubert was 18 years old when he composed it. *If I had a choice between founding Facebook or writing this song, as my lifetime accomplishment, I would not hesitate one second, and it's not going to be the Facebook*. @@ -53,11 +52,11 @@ very good algorithm and we'll need it later in the course. It's a very important component to be used in multiple places. For example, when implementing quicksort. Plus, we will be able to investigate -several interesting techniques and discover one deep problem on this very +several interesting techniques and discover one deep problem on this very simple sorting algorithm. Binary search is not always a good thing to do. -Remember insertion sort is good when things are almost sorted. +Remember insertion sort is good when things are almost sorted. They are pretty close to where they should be. Binary search is going to poke far away. It's going to do a little poking, logarithmic poking, @@ -75,7 +74,7 @@ How do you make code beautiful? We find the main loop invariant. In order to be able to do the right thing, we need to see what we need to accomplish. -Our goal (like binary insert) is to +Our goal, like binary insert, is to insert an element into the portion at the front of the range. We first copy the element `value` out to insert, @@ -88,23 +87,19 @@ as far left as possible (requiring `BidirectionalIterator`). first hole When are we allowed to move the hole? -What is the condition? -The first is that `hole != first`. -If this happens, we cannot move any further. -The other condition is: +What are the conditions? - prev(hole) < value +1. `hole != first`. If this happens, we cannot move any further. +2. `value < prev(hold)`. If both hold, we continue to move left. Eventually, one of the conditions will not hold. -We can even prove it. +We can even prove it: There are only finitely many things in the range, -so after so many iterations it will be exhausted. -These termination proofs are not usually very profound. +so after so many iterations it will be exhausted (these termination proofs are not usually very profound). In our code we will call `hole`, `current`: - template // I is BidirectionalIterator // R is WeakStrictOrdering on the value type of I @@ -120,9 +115,10 @@ In our code we will call `hole`, `current`: return current; } -When `first == current` at the start, it will swap. -Would a check be better? -As we have talked about it before, this is a case that seldom happens +When `first == current` at the start, it will copy `*current` to a temporary +variable and put it right back. +Would a check be better to avoid this? +As we have talked about before, this is a case that seldom happens whereas adding an explicit check would slow down every other case. 
Of course, we need to define predecessor: @@ -132,12 +128,11 @@ Of course, we need to define predecessor: inline I predecessor(I x) { return --x; } -### Insertion sort +### Traditional insertion sort Now linear insertion sort is about identical to binary insertion sort, we just use `linear_insert`, instead of `binary_insert`. - template // I is BidirectionalIterator // N is Integral @@ -158,7 +153,7 @@ we just use `linear_insert`, instead of `binary_insert`. Let's write the version for bounded ranges. It's very easy. As a base case in our induction, an empty range, and a one -element range are both sorted. +element range, are both sorted. // I is BidirectionalIterator // R is WeakStrictOrdering on the value type of I @@ -173,22 +168,22 @@ element range are both sorted. } -## Sentinel version +## Sentinel insertion sort I think we can optimize it further. You might argue we don't need to. But, let me tell you one of the most humiliating times in my life. -I was giving a talk at [Stanford][stanford] and a certain professor walked into the talk -and when I showed my code, roughly like what we just wrote. -It was a little different, but same idea. -I said, it's obviously optimal. +I was giving a talk at [Stanford][stanford]. +A certain professor walked into the talk +and when I showed my code, roughly like what we just wrote (it was a little different, but same idea), I said, "it's obviously optimal". This professor interrupted -and said, no it's not. +and said, "no it's not". The trouble is his name was Donald Knuth. -When somebody like Don Knuth says that your code is not optimal you are in a difficult situation. +When somebody like Don Knuth says that your code is not optimal, +you are in a difficult situation. His argument was that we do this conditional check: - while (first != current ...) + first != current `n` times, when we don't need to. If you're really into performance you have to put a [sentinel][sentinel] in the back. @@ -196,8 +191,8 @@ I was using sentinels before, but from that point on I decided to make an effort to use them more. This is a valid argument. -We are not here to impose some theoretical conditions on algorithms we are here to -take existing efficient algorithms and find how to express them. +We are not here to impose some theoretical conditions on algorithms. +We are here to take existing efficient algorithms and find how to express them. We have to write whatever we write to enable this code to work. What this means is that we sometimes have to reject or ignore other notions of "good software engineering" in order to get @@ -217,7 +212,6 @@ and let's rewrite the precondition. // current is a valid iterator && // first != current && // !r(*current, *first) - Now we can remove the condition: @@ -242,7 +236,7 @@ Now we can remove the condition: [stanford]: https://en.wikipedia.org/wiki/Stanford_University -### Insertion sort in quicksort +### Application to quicksort Now we need to write a new insertion sort which guarantees this condition. @@ -253,20 +247,20 @@ Eventually we hope to study a very important algorithm called quicksort. A long time ago the person who invented it, [Tony Hoare][hoare] observed that quicksort becomes inefficient towards the end of recursion, when you start doing partitions of very small things. -He suggested that we run insertion sort, down there at the lower +He suggested that we run insertion sort down there, at the lower levels of recursion. 
-Originally, people thought they would just wait for the range -to get small, and then call insertion sort every time. +Originally people thought they would go down recursively and call insertion +sort every time we reach a small subrange. But, Hoare observed you really don't need to call insertion sort many times. You can just stop the recursion when quicksort sorts things up to a certain size, -and then run one pass of insertion sort, over the whole thing. -Remember, we talked about how insertion sort is good when things are -almost where they should be. +and then, at the end, run one pass of insertion sort, over the whole thing. +Because when things are almost where they should be, insertion sort is effective. +Quicksort can guarantee that eventually everyone will be within some threshold of their final destination. Let's assume we are sorting a million elements. After we are done with quicksort we have a threshold, -say `k`, and all the elements are partitioned -into blocks, or sub ranges of this size. +say `k`, and all the elements are partitioned. +The first partition will be somewhere within the first k elements, [ | ... ] ^ ^ @@ -281,12 +275,15 @@ For example, when we stop quicksort early we might have: [ 1, 3, 5 | 2, 11, 17 ... ] -How can we have 2 on the right side? We don't know exactly -where quicksort stopped, just that it's in some threshold. 5 is not a sentinel for the right side, because 2 is smaller. 1 is the sentinel. -Now that is quicksort, but we will design our components in a way to support it. +The line drawn in that range is not necessarily a quicksort partition. +If it were, we couldn't have 2 on the right side. +We have absolutely no idea where the real quicksort partitions are. +But, we know that there is a partition boundary within some threshold +(say left of the line). +Now that is quicksort, but we will design our components in a way to support it. Let's write a function that assumes we have a prefix that is sorted and contains a sentinel. @@ -306,6 +303,8 @@ and contains a sentinel. } } +### Optimized insertion sort + So let us copy `linear_insertion_sort` and make our definitive `insertion_sort`. template @@ -317,11 +316,11 @@ So let us copy `linear_insertion_sort` and make our definitive `insertion_sort`. ++current; if (current == last) return; // create a sentinel - rotate_right_by_one(first, ++std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + rotate_right_by_one(first, ++min); insertion_sort_suffix(current, last, r); } - Copy paste, and make the unstable version by replacing rotate with swap. @@ -334,7 +333,8 @@ by replacing rotate with swap. ++current; if (current == last) return; // create a sentinel - std::swap(*first, *std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + std::swap(*first, *min); insertion_sort_suffix(current, last, r); } @@ -344,13 +344,11 @@ by replacing rotate with swap. ## Selection sort This leads us to another classical sort, -it's very slow, but since it's classical and takes only a few lines -we will discovery it. +it's very slow, but since it's classical and takes only a few lines we will discover it. What's the idea of selection sort? You find the min, put him in the first place. -You find the min of the remaining range, put him in the next place, -and so on. +You find the min of the remaining range, put him in the next place, and so on. Could we write it? template @@ -358,33 +356,34 @@ Could we write it? 
// R is WeakStrictOrdering on the value type of I void selection_sort(I first, I last, R r) { while (first != last) { - std::swap(*first, *std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + std::swap(*first, *min); ++first; } } It's not stable, but it's not hard to fix. -The problem is swap might skip over lot's of equal guys. +The problem is `std::swap` might skip over lots of equal guys. template // I is ForwardIterator // R is WeakStrictOrdering on the value type of I void stable_selection_sort(I first, I last, R r) { while (first != last) { - rotate_right_by_one(first, ++std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + rotate_right_by_one(first, ++min); ++first; } } - -Comparison is typically fast, but swap we tend to think of as slow. -Imagine they are big buildings that need to be carried. -The unstable is actually amazing in that it just does `n - 1` swaps, -always. -Merge sort, quicksort, the do like `n log(n)`. +Comparison is typically fast, but `swap` we tend to think of as slow. +Imagine the elements are buildings that need to be lifted up and carried to swap places. +The unstable `selection_sort` is actually amazing in that it just does `n - 1` swaps, always. +Merge sort, quicksort, they do like `n log(n)` swaps. Is it practically important? No. -Not once have I needed it. -However, it shows us how to create a sentinel. +Not once have I needed selection sort. +So why do I talk about it? +It shows us how to create a sentinel. ## Preconditions are essential @@ -396,60 +395,51 @@ you implement a comparison function which throws a coin. It randomly generates true or false. Then take `std::sort` and pass this function to it, because it will obviously create a randomly shuffled thing. -Low and behold to everybody's amazement it caused -segmentation fault. -There were messages throughout Google saying STL is -totally broken. +Lo and behold to everybody's amazement it caused a segmentation fault. +There were messages throughout Google saying, "STL is totally broken". Obviously, because it brings Google down. Let's argue why he shouldn't do what he did. -1. There is an algorithm in stl called [`std::random_shuffle`][cpp-random-shuffle]. +1. There is an algorithm in STL called [`std::random_shuffle`][cpp-random-shuffle]. Why not use that? -2. Somebody more advanced, would say, even if it worked, it wouldn't - be a uniform random shuffle. - It is screwed up, but it requires probability theory or Knuth. +2. Somebody more advanced, would say, even if it worked, it wouldn't be a uniform random shuffle[^uniform-shuffle]. + What he did is screwed up. + But knowing that requires probability theory or Knuth. These people at Google just don't read. - The brightest people do not need to read (joke). + The brightest people do not *need* to read (joke). -3. My dear friend Dave Musser who was on sabbatical at Google ventured to post - that he did not satisfy the preconditions. - Random true/false is not a weak strict ordering, - or any ordering whatsoever. - He tried to explain, but they said no. - It should work with whatever I pass to it. +3. My dear friend Dave Musser who was on sabbatical at Google ventured to post that he did not satisfy the preconditions. + Randomly returning true or false is not a weak strict ordering, or any ordering whatsoever. + He tried to explain, but they said no. + It should work with whatever I pass to it. -As you can imagine, we cannot rely on any properties, like sentinel with this -going on. 
+As you can imagine, we cannot rely on any properties, like sentinel with this going on. For a while there were a bunch of postings on the internet saying, do not use `std::sort` because it requires `WeakStrictOrdering`. It's provably the weakest possible requirement. I thought it was good, but they turned it around and said no. Use `std::stable_sort`. -I still see this in code, people use it when they don't need -stability because they read these discussions. +I still see this in code, people use it when they don't need stability because they read these discussions. -Apparently it is an expectation of a modern -programmer that you don't have to satisfy any precondition. +Apparently it is an expectation of a modern programmer that you don't have to satisfy any precondition. Things should do something and never cause a segmentation fault. It is a tricky thing. Nowadays I wonder. What should we do when we build components? -Should we assume that we build them the fastest way and carefully -specify preconditions? -Or should we build idiot-proof or Google quality components. +Should we assume that we build them the fastest way and carefully specify preconditions? +Or should we build idiot-proof (Google quality) components? This is a difficult question. I do not know the answer. ## Final project -Write the fastest version of stable sort that you can, utilizing the ideas -in these tools. +Write the fastest version of stable sort that you can, utilizing the ideas in these tools. We have all the algorithmic ideas we need. But, I invite you to read books. I invite you to test and measure. If you want you can even go read old STL code I wrote. -It's a competition, consider teaming up and sharing ideas. +It's a competition! Consider teaming up and sharing ideas. ## Code @@ -459,4 +449,9 @@ It's a competition, consider teaming up and sharing ideas. [cpp-random-shuffle]: https://en.cppreference.com/w/cpp/algorithm/random_shuffle +[^uniform-shuffle]: A uniform shuffle of a range of elements `x1 ... xn` + is an algorithm which produces a random permutation of the elements, + in a manner such that all possible permutations are equally likely. + Since there are `n!` permutations, each permutation should occur + with probability `1/n!` (See "The Art of Computer Programming" 3.4.2). 
diff --git a/code/algorithm.h b/code/algorithm.h index 6c4d38f..90bd7e3 100644 --- a/code/algorithm.h +++ b/code/algorithm.h @@ -75,7 +75,7 @@ template inline typename std::iterator_traits::difference_type distance(I first, I last) { - // [first, n) is a valid range + // [first, last) is a valid range return distance(first, last, std::iterator_traits::iterator_category()); } @@ -85,7 +85,7 @@ template inline typename std::iterator_traits::difference_type distance(I first, I last, std::input_iterator_tag) { - // [first, n) is a valid range + // [first, last) is a valid range typename std::iterator_traits::difference_type n(0); while (first != last) { ++n; @@ -99,7 +99,7 @@ template inline typename std::iterator_traits::difference_type distance(I first, I last, std::random_access_iterator_tag) { - // [first, n) is a valid range + // [first, last) is a valid range return last - first; } diff --git a/code/binary_counter.h b/code/binary_counter.h index 723cd25..3a2e6b4 100644 --- a/code/binary_counter.h +++ b/code/binary_counter.h @@ -3,31 +3,6 @@ #include -template -class binary_counter -{ -private: - std::vector counter; - Op op; - T zero; - -public: - binary_counter(const Op& op, const T& zero) : - op(op), zero(zero) {} - - void reserve(size_t n) { counter.reserve(n); } - - void add(T x) { - x = add_to_counter(counter.begin(), counter.end(), op, zero, x); - if (x != zero) counter.push_back(x); - } - - // returns: value of the counter - T reduce() { - return reduce_counter(counter.begin(), counter.end(), op, zero); - } -}; - template // requires Op is BinaryOperation(T) // and Op is associative @@ -64,4 +39,39 @@ T reduce_counter(I first, I last, Op op, const T& zero) { } return result; } + +template +class binary_counter +{ +private: + Op op; + T zero; + std::vector counter; + +public: + binary_counter(const Op& op, const T& zero) : + op(op), zero(zero) {} + + void reserve(size_t n) { counter.reserve(n); } + + void add(T x) { + x = add_to_counter(counter.begin(), counter.end(), op, zero, x); + if (x != zero) counter.push_back(x); + } + + // returns: value of the counter + T reduce() { + return reduce_counter(counter.begin(), counter.end(), op, zero); + } + + // For debug. Not in the course. 
+ typename std::vector::const_iterator begin() const { + return counter.begin(); + } + + typename std::vector::const_iterator end() const { + return counter.end(); + } +}; + #endif diff --git a/code/binary_insertion_sort.h b/code/binary_insertion_sort.h index bce838c..48c18e1 100644 --- a/code/binary_insertion_sort.h +++ b/code/binary_insertion_sort.h @@ -11,7 +11,7 @@ template void rotate_right_by_one(I first, I last, std::forward_iterator_tag) { if (first == last) return; I current = first; - while (++current != last) std::swap(first, current); + while (++current != last) std::swap(*first, *current); } template diff --git a/code/index.html b/code/index.html new file mode 100644 index 0000000..e69de29 diff --git a/code/insertion_sort.h b/code/insertion_sort.h index bc00613..5430af7 100644 --- a/code/insertion_sort.h +++ b/code/insertion_sort.h @@ -37,7 +37,8 @@ template // R is WeakStrictOrdering on the value type of I void selection_sort(I first, I last, R r) { while (first != last) { - std::swap(*first, *std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + std::swap(*first, *min); ++first; } } @@ -47,7 +48,8 @@ template // R is WeakStrictOrdering on the value type of I void stable_selection_sort(I first, I last, R r) { while (first != last) { - rotate_right_by_one(first, ++std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + rotate_right_by_one(first, ++min); ++first; } } @@ -96,7 +98,8 @@ void insertion_sort(I first, I last, R r) { ++current; if (current == last) return; // create a sentinel - rotate_right_by_one(first, ++std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + rotate_right_by_one(first, ++min); insertion_sort_suffix(current, last, r); } @@ -109,7 +112,8 @@ void insertion_sort_unstable(I first, I last, R r) { ++current; if (current == last) return; // create a sentinel - std::swap(*first, *std::min_element(first, last, r)); + I min = std::min_element(first, last, r); + std::swap(*first, *min); insertion_sort_suffix(current, last, r); } diff --git a/code/list_pool_iterator.h b/code/list_pool_iterator.h index fc5327c..11ff62b 100644 --- a/code/list_pool_iterator.h +++ b/code/list_pool_iterator.h @@ -52,6 +52,7 @@ struct iterator { } // extend the interface to Singly Linked List Iterator: + // (this concept and these methods are not discussed in the course.) 
friend void push_front(iterator& x, const T& value) { diff --git a/code/minmax.h b/code/minmax.h index f365816..43e5f4a 100644 --- a/code/minmax.h +++ b/code/minmax.h @@ -6,9 +6,9 @@ template inline const T& min(const T& a, const T& b, Compare cmp) { if (cmp(b, a)) { - return a; + return b; } else { - return b; + return a; } } @@ -23,7 +23,7 @@ struct less { template const T& min(const T& a, const T& b) { - return min(a, b, std::less()); + return min(a, b, less()); } template @@ -37,6 +37,10 @@ const T& max(const T& a, const T& b, Compare cmp) { } } +template +const T& max(const T& a, const T& b) { + return max(a, b, less()); +} template // requires Compare is a StrictWeakOrdering on T diff --git a/code/search.h b/code/search.h index 8a86985..72d5132 100644 --- a/code/search.h +++ b/code/search.h @@ -25,7 +25,7 @@ template // value type of I is the same as argument type of P std::pair find_if_n(I first, N n, P pred) { while (n && !pred(*first)) {--n; ++first;} - return std::make_pair (first, n); + return std::make_pair(first, n); } template diff --git a/code/test_binary_counter.cpp b/code/test_binary_counter.cpp new file mode 100644 index 0000000..51c508d --- /dev/null +++ b/code/test_binary_counter.cpp @@ -0,0 +1,35 @@ +#include +#include +#include +#include +#include "binary_counter.h" + +template +struct min_op { + T operator()(const T& a, const T& b) { + return std::min(a, b); + } +}; + +int main() { + char letters[] = { 'C', 'H', 'B', 'F', 'I', 'D', 'E', 'G', 'A', 'J' }; + + char* first = letters; + char* last = letters + (sizeof(letters) / sizeof(char)); + + typedef min_op op_type; + + binary_counter counter(op_type(), '_'); + + while (first != last) { + std::cout << "add: " << *first << std::endl; + counter.add(*first); + + std::ostream_iterator out(std::cout, " "); + std::copy(counter.begin(), counter.end(), out); + std::cout << std::endl; + + ++first; + } + std::cout << "min (after reduce): " << counter.reduce() << std::endl; +} diff --git a/code/test_insertion_sort.cpp b/code/test_insertion_sort.cpp index 652893a..5b81929 100644 --- a/code/test_insertion_sort.cpp +++ b/code/test_insertion_sort.cpp @@ -1,21 +1,57 @@ #include #include +#include #include "insertion_sort.h" +#include "list_pool.h" #include "algorithm.h" int main() { + + // test binary insertion sort { int nums[] = {5, 3, 10, 1, 2}; - print_range(nums, nums + 5); binary_insertion_sort_n(nums, 5, std::less()); print_range(nums, nums + 5); + assert(is_sorted(nums, nums + 5)); } + // test linear insertion sort { int nums[] = {5, 3, 10, 1, 2}; - print_range(nums, nums + 5); linear_insertion_sort_n(nums, 5, std::less()); print_range(nums, nums + 5); + + assert(is_sorted(nums, nums + 5)); } + // test for forward iterators + { + // std::list provides bidirectional iterators and + // std::forward_list is in C++11, so we will use + // our list pool to test this. 
diff --git a/code/search.h b/code/search.h
index 8a86985..72d5132 100644
--- a/code/search.h
+++ b/code/search.h
@@ -25,7 +25,7 @@ template <typename I, typename N, typename P>
 // value type of I is the same as argument type of P
 std::pair<I, N> find_if_n(I first, N n, P pred) {
   while (n && !pred(*first)) {--n; ++first;}
-  return std::make_pair (first, n);
+  return std::make_pair(first, n);
 }
 
 template <typename I, typename P>
diff --git a/code/test_binary_counter.cpp b/code/test_binary_counter.cpp
new file mode 100644
index 0000000..51c508d
--- /dev/null
+++ b/code/test_binary_counter.cpp
@@ -0,0 +1,35 @@
+#include <algorithm>
+#include <iostream>
+#include <iterator>
+#include "binary_counter.h"
+
+template <typename T>
+struct min_op {
+  T operator()(const T& a, const T& b) {
+    return std::min(a, b);
+  }
+};
+
+int main() {
+  char letters[] = { 'C', 'H', 'B', 'F', 'I', 'D', 'E', 'G', 'A', 'J' };
+
+  char* first = letters;
+  char* last = letters + (sizeof(letters) / sizeof(char));
+
+  typedef min_op<char> op_type;
+
+  binary_counter<op_type, char> counter(op_type(), '_');
+
+  while (first != last) {
+    std::cout << "add: " << *first << std::endl;
+    counter.add(*first);
+
+    std::ostream_iterator<char> out(std::cout, " ");
+    std::copy(counter.begin(), counter.end(), out);
+    std::cout << std::endl;
+
+    ++first;
+  }
+  std::cout << "min (after reduce): " << counter.reduce() << std::endl;
+}
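
Note on test_binary_counter.cpp: the test adds one element at a time and prints the counter's slots after each `add`, so the carry propagation is visible. The loop it exercises should look like the course's add-to-counter (a sketch from the course material, not part of this patch):

    // "zero" marks an empty slot; op combines two entries of equal weight.
    // precondition: carry != zero
    template <typename T, typename I, typename Op>
    T add_to_counter(I first, I last, Op op, const T& zero, T carry) {
      while (first != last) {
        if (*first == zero) {
          *first = carry; // found an empty slot: deposit and stop
          return zero;
        }
        carry = op(*first, carry); // combine entries of equal weight
        *first = zero;             // slot empties; carry moves up a level
        ++first;
      }
      return carry; // counter is full; caller extends storage
    }

With `min_op`, slot k holds the winner of a balanced tournament of 2^k elements, which is why `reduce()` can report the overall minimum after n - 1 comparisons in total.
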
diff --git a/code/test_insertion_sort.cpp b/code/test_insertion_sort.cpp
index 652893a..5b81929 100644
--- a/code/test_insertion_sort.cpp
+++ b/code/test_insertion_sort.cpp
@@ -1,21 +1,57 @@
 #include <iostream>
 #include <functional>
+#include <cassert>
 #include "insertion_sort.h"
+#include "list_pool.h"
 #include "algorithm.h"
 
 int main() {
+
+  // test binary insertion sort
   {
     int nums[] = {5, 3, 10, 1, 2};
-    print_range(nums, nums + 5);
     binary_insertion_sort_n(nums, 5, std::less<int>());
     print_range(nums, nums + 5);
+    assert(is_sorted(nums, nums + 5));
   }
 
+  // test linear insertion sort
   {
     int nums[] = {5, 3, 10, 1, 2};
-    print_range(nums, nums + 5);
     linear_insertion_sort_n(nums, 5, std::less<int>());
     print_range(nums, nums + 5);
+
+    assert(is_sorted(nums, nums + 5));
   }
 
+  // test for forward iterators
+  {
+    // std::list provides bidirectional iterators and
+    // std::forward_list is in C++11, so we will use
+    // our list pool to test this.
+
+    list_pool<int, size_t> pool;
+
+    // Create a list "1 2 3",
+    typename list_pool<int, size_t>::list_type l = pool.end();
+    l = pool.allocate(3, l);
+    l = pool.allocate(2, l);
+    l = pool.allocate(1, l);
+
+    typedef typename list_pool<int, size_t>::iterator I;
+    I first(pool, l);
+    I last(pool);
+
+    // do a few rotates and print the results
+    rotate_right_by_one(first, last);
+    print_range(first, last);
+
+    rotate_right_by_one(first, last);
+    print_range(first, last);
+
+    rotate_right_by_one(first, last);
+    print_range(first, last);
+  }
+
+  return 0;
 }
diff --git a/code/test_list_pool.cpp b/code/test_list_pool.cpp
new file mode 100644
index 0000000..2639da3
--- /dev/null
+++ b/code/test_list_pool.cpp
@@ -0,0 +1,29 @@
+#include <cstddef>
+#include <iostream>
+#include "list_pool.h"
+
+int main() {
+  typedef typename list_pool<char, size_t>::list_type L;
+
+  list_pool<char, size_t> pool;
+
+  // Create a list "a b c",
+  // same as Lisp (cons a (cons b (cons c '()))).
+  L l = pool.end();
+  l = pool.allocate('c', l);
+  l = pool.allocate('b', l);
+  l = pool.allocate('a', l);
+
+  // Print contents of list.
+  L current = l;
+  while (current != pool.end()) {
+    std::cout << pool.value(current) << " ";
+    current = pool.next(current);
+  }
+  std::cout << std::endl;
+
+  // Pool will cleanup automatically when out of scope.
+  // But, let's go ahead and test free.
+  free_list(pool, current);
+}
diff --git a/code/test_list_pool_iterator.cpp b/code/test_list_pool_iterator.cpp
new file mode 100644
index 0000000..3cbb7e0
--- /dev/null
+++ b/code/test_list_pool_iterator.cpp
@@ -0,0 +1,24 @@
+#include <algorithm>
+#include <iostream>
+#include <iterator>
+#include "list_pool.h"
+
+int main() {
+  list_pool<char, size_t> pool;
+
+  // Create a list "a b c",
+  // same as Lisp (cons a (cons b (cons c '()))).
+  typename list_pool<char, size_t>::list_type l = pool.end();
+  l = pool.allocate('c', l);
+  l = pool.allocate('b', l);
+  l = pool.allocate('a', l);
+
+  typedef typename list_pool<char, size_t>::iterator I;
+  I first(pool, l);
+  I last(pool);
+
+  // Print contents of list.
+  std::copy(first, last, std::ostream_iterator<char>(std::cout, " "));
+}
diff --git a/code/test_min.cpp b/code/test_min.cpp
new file mode 100644
index 0000000..ded456a
--- /dev/null
+++ b/code/test_min.cpp
@@ -0,0 +1,10 @@
+#include <cassert>
+#include "minmax.h"
+
+int main() {
+  assert(min(1, 2) == 1);
+  assert(min(2, 1) == 1);
+
+  assert(max(1, 2) == 2);
+  assert(max(2, 1) == 2);
+}
diff --git a/code/test_minmax.cpp b/code/test_minmax.cpp
index 79536f7..66fd821 100644
--- a/code/test_minmax.cpp
+++ b/code/test_minmax.cpp
@@ -49,19 +49,21 @@ std::pair<size_t, size_t> minmax_comparisons(I first, I last) {
   typedef typename std::iterator_traits<I>::value_type T;
   std::pair<size_t, size_t> result;
   std::pair<typename std::vector<T>::iterator, typename std::vector<T>::iterator> m0, m1;
+
+  std::vector<T> seq(first, last);
   {
-    std::vector<T> seq(first, last);
     comparisons = 0;
     m0 = ::minmax_element_simple(seq.begin(), seq.end(), counting_less<T>);
     result.first = comparisons;
   }
 
   {
-    std::vector<T> seq(first, last);
     comparisons = 0;
     m1 = ::minmax_element(seq.begin(), seq.end(), counting_less<T>);
     result.second = comparisons;
   }
-  if (m0 != m1) std::cout << "Failed: different mins or maxs\n";
+  if (m0 != m1) {
+    std::cout << "Failed: different mins or maxs\n";
+  }
   return result;
 }
diff --git a/build_with_docker.sh b/docker_build.sh
similarity index 100%
rename from build_with_docker.sh
rename to docker_build.sh
diff --git a/index.html b/index.html
index 0eb6b9e..4d2b6f7 100644
[index.html: diff of generated HTML; the markup was lost in extraction. Recoverable changes: the page title becomes "Efficient Programming with Components"; the large inline style block (116 lines down to 9) is removed, apparently in favor of the new template/style.css; the "Notes by: Justin Meiners (2021)" credit line is updated; and table-of-contents entries are retitled to match the chapter headings: "6. Ordering, min, and max." and "A plan for improvement." lose their trailing periods, "Binary counter object" becomes "Binary counter class", "Sentinel version" becomes "Sentinel insertion sort", "Program design approach" is replaced by "Write code backward" and "Overview", and a new entry "Affiliated types for iterators" appears under "9. Iterators".]
diff --git a/papers/concepts_proposal.pdf b/papers/concepts-proposal.pdf
similarity index 100%
rename from papers/concepts_proposal.pdf
rename to papers/concepts-proposal.pdf
diff --git a/papers/evolving-a-language.pdf b/papers/evolving-a-language.pdf
new file mode 100644
index 0000000..6edfaca
Binary files /dev/null and b/papers/evolving-a-language.pdf differ
diff --git a/papers/regular-expressions.pdf b/papers/regular-expressions.pdf
new file mode 100644
index 0000000..18b05b4
Binary files /dev/null and b/papers/regular-expressions.pdf differ
diff --git a/template/cover.html b/template/cover.html
index b55b363..195d279 100644
[template/cover.html: diff of generated HTML; markup lost in extraction. The same "Notes by: Justin Meiners (2021)" credit-line update as in index.html.]
diff --git a/template/prefix.html b/template/prefix.html
index 036f34c..4960ae0 100644
[template/prefix.html: diff of the shared HTML prefix template; markup lost in extraction. The page title becomes the $TITLE substitution variable, and the large inline style block (116 lines down to 9) is removed, apparently in favor of the new template/style.css below.]
diff --git a/template/style.css b/template/style.css
new file mode 100644
index 0000000..879661e
--- /dev/null
+++ b/template/style.css
@@ -0,0 +1,286 @@
+html {
+  font-size: 100%;
+  overflow-y: scroll;
+  -webkit-text-size-adjust: 100%;
+  -ms-text-size-adjust: 100%;
+}
+
+body {
+  color: #444;
+  font-family: Georgia, Palatino, 'Palatino Linotype', Times, 'Times New Roman', serif;
+  font-size: 16px;
+  line-height: 1.5em;
+  padding: 1em;
+  margin: auto;
+  max-width: 42em;
+  background: #fefefe;
+}
+
+a {
+  color: #0645ad;
+  text-decoration: none;
+}
+
+a:visited {
+  color: #0b0080;
+}
+
+a:hover {
+  color: #06e;
+}
+
+a:active {
+  color: #faa700;
+}
+
+a:focus {
+  outline: thin dotted;
+}
+
+a:hover,
+a:active {
+  outline: 0;
+}
+
+::-moz-selection {
+  background: rgba(255, 255, 0, 0.3);
+  color: #000;
+}
+
+::selection {
+  background: rgba(255, 255, 0, 0.3);
+  color: #000;
+}
+
+a::-moz-selection {
+  background: rgba(255, 255, 0, 0.3);
+  color: #0645ad;
+}
+
+a::selection {
+  background: rgba(255, 255, 0, 0.3);
+  color: #0645ad;
+}
+
+p {
+  margin: 1em 0;
+}
+
+img {
+  max-width: 100%;
+}
+
+h1,
+h2,
+h3,
+h4,
+h5,
+h6 {
+  font-weight: normal;
+  color: #111;
+  line-height: 1em;
+}
+
+h4,
+h5,
+h6 {
+  font-weight: bold;
+}
+
+h1 {
+  font-size: 2.5em;
+}
+
+h2 {
+  font-size: 2em;
+}
+
+h3 {
+  font-size: 1.5em;
+}
+
+h4 {
+  font-size: 1.2em;
+}
+
+h5 {
+  font-size: 1em;
+}
+
+h6 {
+  font-size: 0.9em;
+}
+
+blockquote {
+  color: #666666;
+  margin: 0;
+  padding-left: 3em;
+  border-left: 0.5em #eee solid;
+}
+
+hr {
+  display: block;
+  border: 0;
+  border-top: 1px solid #aaa;
+  border-bottom: 1px solid #eee;
+  margin: 1em 0;
+  padding: 0;
+}
+
+pre,
+code,
+kbd,
+samp {
+  color: #000;
+  font-family: monospace, monospace;
+  _font-family: 'courier new', monospace;
+  font-size: 0.98em;
+}
+
+pre {
+  white-space: pre;
+  white-space: pre-wrap;
+  word-wrap: break-word;
+}
+
+b,
+strong {
+  font-weight: bold;
+}
+
+dfn {
+  font-style: italic;
+}
+
+ins {
+  background: #ff9;
+  color: #000;
+  text-decoration: none;
+}
+
+mark {
+  background: #ff0;
+  color: #000;
+  font-style: italic;
+  font-weight: bold;
+}
+
+sub,
+sup {
+  font-size: 75%;
+  line-height: 0;
+  position: relative;
+  vertical-align: baseline;
+}
+
+sup {
+  top: -0.5em;
+}
+
+sub {
+  bottom: -0.25em;
+}
+
+ul,
+ol {
+  margin: 1em 0;
+  padding: 0 0 0 2em;
+}
+
+li p:last-child {
+  margin: 0;
+}
+
+dd {
+  margin: 0 0 0 2em;
+}
+
+img {
+  border: 0;
+  -ms-interpolation-mode: bicubic;
+  vertical-align: middle;
+}
+
+table {
+  border-collapse: collapse;
+  border-spacing: 0;
+}
+
+td {
+  vertical-align: top;
+}
+
+@media print {
+  * {
+    background: transparent !important;
+    color: black !important;
+    filter: none !important;
+    -ms-filter: none !important;
+  }
+
+  body {
+    font-size: 12pt;
+    max-width: 100%;
+  }
+
+  a,
+  a:visited {
+    text-decoration: underline;
+  }
+
+  hr {
+    height: 1px;
+    border: 0;
+    border-bottom: 1px solid black;
+  }
+
+  a[href]:after {
+    content: " (" attr(href) ")";
+  }
+
+  abbr[title]:after {
+    content: " (" attr(title) ")";
+  }
+
+  .ir a:after,
+  a[href^="javascript:"]:after,
+  a[href^="#"]:after {
+    content: "";
+  }
+
+  pre,
+  blockquote {
+    border: 1px solid #999;
+    padding-right: 1em;
+    page-break-inside: avoid;
+  }
+
+  tr,
+  img {
+    page-break-inside: avoid;
+  }
+
+  img {
+    max-width: 100% !important;
+  }
+
+  @page :left {
+    margin: 15mm 20mm 15mm 10mm;
+  }
+
+  @page :right {
+    margin: 15mm 10mm 15mm 20mm;
+  }
+
+  p,
+  h2,
+  h3 {
+    orphans: 3;
+    widows: 3;
+  }
+
+  h2,
+  h3 {
+    page-break-after: avoid;
+  }
+}
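
A closing note on code/test_minmax.cpp above: hoisting `seq` out of the two blocks is what makes the `m0 != m1` check valid. Previously each block filled its own vector, so by the time the pairs were compared, `m0` pointed into a vector that no longer existed. The quantity being measured is the classic one from the course: a straightforward minmax performs about 2(n - 1) comparisons, while the pairwise algorithm performs about 3n/2. A sketch of the straightforward version, on the assumption that `minmax_element_simple` has this shape:

    #include <algorithm>
    #include <utility>

    // Two independent passes: (n - 1) + (n - 1) comparisons,
    // versus 3 * ceil(n/2) - 2 for the pairwise minmax_element.
    template <typename I, typename Compare>
    std::pair<I, I> minmax_element_simple(I first, I last, Compare cmp) {
      return std::make_pair(std::min_element(first, last, cmp),
                            std::max_element(first, last, cmp));
    }
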