| 1 | +The following are samples of 13-grams that occurred in our filtered Common Crawl dataset, along with how many times each was seen. The samples show that the most frequent 13-grams tend to be junk, while the least frequent appear to be reasonable content. |
| 2 | + |
| 3 | +### Top 13-grams |
| 4 | +``` |
| 5 | +159833 all rights reserved this material may not be published broadcast rewritten or redistributed |
| 6 | +61887 1 2 3 4 5 6 7 8 9 10 11 12 13 |
| 7 | +47145 2 3 4 5 6 7 8 9 10 11 12 13 14 |
| 8 | +45532 3 4 5 6 7 8 9 10 11 12 13 14 15 |
| 9 | +42200 a b c d e f g h i j k l m |
| 10 | +36025 e f g h i j k l m n o p q |
| 11 | +35635 b c d e f g h i j k l m n |
| 12 | +33933 4 5 6 7 8 9 10 11 12 13 14 15 16 |
| 13 | +33532 c d e f g h i j k l m n o |
| 14 | +33516 d e f g h i j k l m n o p |
| 15 | +32437 associated press all rights reserved this material may not be published broadcast rewritten |
| 16 | +28406 0 0 0 0 0 0 0 0 0 0 0 0 0 |
| 17 | +28382 5 6 7 8 9 10 11 12 13 14 15 16 17 |
| 18 | +28329 7 8 9 10 11 12 13 14 15 16 17 18 19 |
| 19 | +``` |
| 20 | + |
| 21 | +### N=1001 samples |
| 22 | +``` |
| 23 | +1080 any form or by any means electronic mechanical photocopying recording or otherwise without |
| 24 | +1071 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 25 | +1028 denmark djibouti dominica dominican republic east timor ecuador egypt el salvador equatorial guinea |
| 26 | +1020 name of the father and of the son and of the holy spirit |
| 27 | +1016 shall be deprived of life liberty or property without due process of law |
| 28 | +1014 i m going to go out on a limb here and say that |
| 29 | +1012 creating new collective works for resale or redistribution to servers or lists or |
| 30 | +1008 country can do for you ask what you can do for your country |
| 31 | +1006 1 0 1 0 1 0 1 0 1 0 1 0 1 |
| 32 | +1006 your country can do for you ask what you can do for your |
| 33 | +1001 ask not what your country can do for you ask what you can |
| 34 | +``` |
| 35 | + |
| 36 | +### N=101 samples |
| 37 | +``` |
| 38 | +101 6 chapter 7 chapter 8 chapter 9 chapter 10 chapter 11 chapter 12 |
| 39 | +101 place where when you have to go there they have to take you |
| 40 | +101 including the right to reproduce this book or portions thereof in any form |
| 41 | +101 there was with the angel a multitude of the heavenly host praising god |
| 42 | +101 unusual and extraordinary threat to the national security foreign policy and economy of |
| 43 | +101 to this world but be ye transformed by the renewing of your mind |
| 44 | +101 limiting climate change will require substantial and sustained reductions of greenhouse gas emissions |
| 45 | +101 the 16 bit one s complement of the one s complement sum of |
| 46 | +101 observatory is a facility of the national science foundation operated under cooperative agreement |
| 47 | +101 be at the end of the world the angels shall come forth and |
| 48 | +101 you and i want to spend the rest of my life with you |
| 49 | +``` |
| 50 | + |
| 51 | +### N=50 samples |
| 52 | +``` |
| 52 | +50 eternal life for god so loved the world that he gave his only |
| 53 | +50 grow not old as we that are left grow old age shall not |
| 54 | +50 higher education in the united states and one of the nine colonial colleges |
| 55 | +50 for the prosecution of persons responsible for serious violations of international humanitarian law |
| 56 | +50 against you in a court of law you have the right to an |
| 57 | +50 2 9 2 10 2 11 2 12 2 13 2 14 2 |
| 58 | +50 was given unto them over the fourth part of the earth to kill |
| 59 | +50 those days and also after that when the sons of god came in |
| 60 | +50 the department of the army department of defense or the u s government |
| 61 | +50 heated and transforms into steam within a boiler operating at a high pressure |
| 62 | +50 either die a hero or you live long enough to see yourself become |
| 63 | +50 niger nigeria norway oman pakistan panama papua new guinea paraguay peru philippines poland |
| 64 | +50 not alter our adherence to plos one policies on sharing data and materials |
| 65 | +50 are some of the most studied and detailed periods of human history military |
| 66 | +50 what did you go out into the wilderness to see a reed shaken |
| 67 | +50 consent written informed consent was obtained from the patient for publication of this |
| 68 | +50 electronic or mechanical including photocopying recording or by any information storage or retrieval |
| 69 | +50 is the home rule municipality that is the county seat and the most |
| 70 | +50 intercellular adhesion molecule 1 icam 1 and vascular cell adhesion molecule 1 vcam |
| 71 | +50 5 6 7 8 9 10 11 12 13 14 15 16 the |
| 72 | +50 noble the owner of life savers candy drugstore chain rexall and new york |
| 73 | +50 characters settings etc are the property of their respective owners the original characters |
| 74 | +50 the years in your life that count it s the life in your |
| 75 | +50 work are fictitious any resemblance to real persons living or dead is purely |
| 76 | +50 0 0 0 1 0 0 0 0 1 1 0 0 0 |
| 77 | +50 the scientific study of language and involves an analysis of language form language |
| 78 | +50 scientific study of language and involves an analysis of language form language meaning |
| 79 | +50 in dulbecco s modified eagle medium dmem supplemented with 10 fetal bovine serum |
| 80 | +50 they have their exits and their entrances and one man in his time |
| 81 | +50 not matter how slowly you go so long as you do not stop |
| 82 | +50 after that when the sons of god came in unto the daughters of |
| 83 | +50 of wrongs love does not delight in evil but rejoices with the truth |
| 84 | +50 drummer let him step to the music which he hears however measured or |
| 85 | +50 the increase of his government and of peace there will be no end |
| 86 | +50 not delight in evil but rejoices with the truth it always protects always |
| 87 | +50 me i once was lost but now am found was blind but now |
| 88 | +``` |
| 89 | + |
| 90 | +### N=21 samples |
| 91 | +``` |
| 92 | +21 all worlds begin in darkness and all so end the heart is no |
| 93 | +21 which includes interviewers computer and laboratory technicians clerical workers research scientists volunteers managers |
| 94 | +21 message and we ll get back to you as soon as we can |
| 95 | +21 lady who s sure all that glitters is gold and she s buying |
| 96 | +21 the american academy of arts and sciences and is a fellow of the |
| 97 | +21 has received funding from the european community s seventh framework programme fp7 2007 |
| 98 | +21 of international humanitarian law committed in the territory of the former yugoslavia since |
| 99 | +21 and evil there is only power and those too weak to seek it |
| 100 | +21 20 episode 21 episode 22 episode 23 episode 24 episode 25 episode 26 |
| 101 | +21 to those who have thrice defied him born as the seventh month dies |
| 102 | +21 grenada montserrat saint kitts and nevis saint lucia saint vincent and the grenadines |
| 103 | +21 b p abbott et al ligo scientific collaboration and virgo collaboration phys rev |
| 104 | +21 although victoria is ranked fourth in terms of gsp per capita because of |
| 105 | +21 you i want to spend the rest of my life with you i |
| 106 | +21 that i happened to be in the right place at the right time |
| 107 | +21 the resulting work only under the same or similar license to this one |
| 108 | +21 cannot be created or destroyed it can only be changed from one form |
| 109 | +21 co operation among states in accordance with the charter of the united nations |
| 110 | +21 this is halloween this is halloween pumpkins scream in the dead of night |
| 111 | +21 gang aft agley an lea e us nought but grief an pain for |
| 112 | +21 i know that i want to spend the rest of my life with |
| 113 | +21 you will also suffer a defeat if you know neither the enemy nor |
| 114 | +21 of thy peace where there is hatred let me sow love where there |
| 115 | +21 israeli war was fought between june 5 and 10 1967 by israel and |
| 116 | +21 secure financial stability facilitate international trade promote high employment and sustainable economic growth |
| 117 | +21 bachelor s and master s degrees in political science from the university of |
| 118 | +21 the which if they should be written every one i suppose that even |
| 119 | +``` |
| 120 | + |
| 121 | +### N=1 samples |
| 122 | +``` |
| 123 | +1 hunched in its belly till my wet fur froze six miles from earth |
| 124 | +1 fleet and then i m going to save the earth and then just |
| 125 | +1 a and chuang i l 2000 quantum computation and quantum information cambridge university |
| 126 | +1 with all constructed languages gauging the number of speakers of ido is an |
| 127 | +1 idols you made to worship therefore i will send you into exile beyond |
| 128 | +1 to the local bar and began to party as if there was no |
| 129 | +1 jeannette rankin of montana becomes the first female member of the united states |
| 130 | +1 the land lay north of the 36 30 parallel where slavery had been |
| 131 | +1 is explored in articles such as the ultimate the absolute god and zeno |
| 132 | +1 whimper her eyes filled with tears she turned on her heel and ran |
| 133 | +1 perceive if he bends his will thither much of what is passing in |
| 134 | +1 a moment when you say to yourself oh there you are i ve |
| 135 | +1 row 2 in first class stabbed the two unarmed flight attendants who would |
| 136 | +1 for three independent experiments statistical significance was assessed with one way anova and |
| 137 | +1 basic shape the fundamental human end of the universe and here you are |
| 138 | +1 didn t think you d be here i didn t think i d |
| 139 | +1 arora c lund r motwani m sudan and m szegedy proof verification and |
| 140 | +1 of the pines while i sat grilling with my clothes stuck to the |
| 141 | +1 cells were maintained at 37 c in an incubator with a humidified atmosphere |
| 142 | +1 revenge and the allied achaean greek armies nearly lose the war in counterpoint |
| 143 | +1 sat as a chamber articles 26 29 of the statute allow the court |
| 144 | +1 i couldn t save your world i couldn t save any of them |
| 145 | +``` |
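Counts like the ones listed above could be produced with a sliding-window tally over normalized text. The following is a minimal sketch, not the pipeline that generated these samples. The normalization rule (lowercase, every non-alphanumeric ASCII character treated as a separator) is an assumption inferred from how the samples look, for example `i m going` and `one s complement`. The names `normalize`, `count_ngrams`, and `sample_with_count` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List

N = 13  # n-gram size used in the samples above


def normalize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a-z or 0-9.

    Assumption: this mirrors the look of the samples (apostrophes and
    punctuation become spaces); the real pipeline may differ.
    """
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def count_ngrams(docs: Iterable[str], n: int = N) -> Counter:
    """Count every n-token window across all documents."""
    counts: Counter = Counter()
    for doc in docs:
        tokens = normalize(doc)
        # Slide a window of n tokens; documents shorter than n yield nothing.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def sample_with_count(counts: Counter, target: int) -> List[str]:
    """Return all n-grams seen exactly `target` times, sorted for stability."""
    return sorted(gram for gram, c in counts.items() if c == target)


quote = ("Ask not what your country can do for you; "
         "ask what you can do for your country.")
counts = count_ngrams([quote, quote])
top = counts.most_common(3)            # analogous to the "Top 13-grams" listing
twice = sample_with_count(counts, 2)   # analogous to an "N=2 samples" listing
```

An in-memory `Counter` is only practical for small inputs. At the scale of a web crawl, the same tally would need a sharded or on-disk approach.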