Tuesday, June 03, 2008

Ejoty in Groovy

Ejoty is a word invented by magician Stewart James to describe the mental skill of easily remembering the numeric value of each letter of the English alphabet (i.e. A=1, B=2, ..., Z=26), which enables quick mental calculation of the value of words (e.g. WORD = W + O + R + D = 23 + 15 + 18 + 4 = 60). The letters in ejoty are those at the multiples of 5 in the alphabet, i.e. E=5, J=10, O=15, T=20, Y=25. If we memorize those, we can easily calculate the value of most other letters by counting only 1 or 2 places from the nearest one.
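
As a rough illustration of that counting trick, here is a minimal Groovy sketch that finds the nearest multiple of 5 for a letter and how far the letter sits from it. The names anchorLetters and viaAnchor are made up for this example and aren't part of any library.

```groovy
// Hypothetical sketch: express a letter as its nearest ejoty anchor plus an offset.
def anchorLetters = [5: 'E', 10: 'J', 15: 'O', 20: 'T', 25: 'Y']

def viaAnchor = { String letter ->
    int value = ((letter.toUpperCase() as char) as int) - 64  // A=1 ... Z=26
    int anchor = (value + 2).intdiv(5) * 5                     // nearest multiple of 5
    // An anchor of 0 has no letter: A and B are simply counted from the start.
    [letter: anchorLetters[anchor], offset: value - anchor, value: value]
}

// W is two places before Y, so W = 25 - 2 = 23.
assert viaAnchor('W') == [letter: 'Y', offset: -2, value: 23]
// G is two places after E, so G = 5 + 2 = 7.
assert viaAnchor('G') == [letter: 'E', offset: 2, value: 7]
```

Under these assumptions no letter ends up more than 2 places from its anchor, with A and B counted directly from the start of the alphabet.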

Musical and aural learners could learn the values of letters by remembering the value of the first letter in each foot of the popular children's song "abcd efg, hijk lmnop, qrs tuv, wx yz", i.e. "1 5, 8 12, 17 20, 23 25".

If we can instantly know the value of each letter, we can more easily practise adding a sequence of numbers in our heads whenever we see words written down somewhere. For example, signs we see when riding public transport:
  SYDNEY = 19 + 25 + 4 + 14 + 5 + 25 = 92
  FLINDERS = 6 + 12 + 9 + 14 + 4 + 5 + 18 + 19 = 87
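
The sums above can be checked with a few lines of Groovy. This is only a sketch: wordValue is a name made up for this example, and it assumes that anything other than the letters A to Z should simply be ignored.

```groovy
// Hypothetical helper: add up the alphabet positions (A=1 ... Z=26) of a word's letters.
def wordValue(String word) {
    // Split into single characters, keep only the letters A-Z, then sum their positions.
    def letters = word.toUpperCase().toList().findAll { it ==~ /[A-Z]/ }
    letters.inject(0) { total, ch -> total + ((ch as char) as int) - 64 }
}

assert wordValue('WORD') == 60
assert wordValue('SYDNEY') == 92
assert wordValue('FLINDERS') == 87
```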


Using Groovy
To more quickly ejotize words, we can learn the sums of common letter sequences off by heart. To find the most common ones, we can write a Groovy program...

After extracting the English word list english.3 within this zip file, we can run this script:

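def gramVal(gr){
  def tot= 0
  // a=1 ... z=26: go via char to get the character code (97 for 'a'),
  // because newer Groovy versions may try to parse a String cast straight to int as a number
  gr.each{ tot += ((it as char) as int) - 96 }
  tot
}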

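def grams= [:]   // maps each letter sequence to the number of times it occurs
new File("english.3").eachLine{word->
  word -= "'"   // drops the first apostrophe in the word, if there is one
  // i is the sequence length; for words shorter than 2 letters the range
  // 2..word.size() counts downwards, so the check below skips lengths that don't fit
  for(int i in 2..word.size())
    if(word.size() >= i)
      for(int j in 0..word.size() - i){   // j is the start position of the sequence
        def gram= word[j..j+i-1].toLowerCase()
        if( grams[gram] != null ) grams[gram]++
        else grams[gram]= 1
      }
}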

grams.entrySet().findAll{it.value > 200}
     .sort{it.value}.reverse().each{
  def gm= gramVal(it.key)
  println "$it.key ($gm): $it.value"
}


That version of the program displays only the sequences of letters occurring more than 200 times in the word list. The first 20 lines of output are:

an (15): 2634
er (23): 2606
in (23): 2080
ar (19): 1977
on (29): 1780
te (25): 1750
ra (19): 1732
en (19): 1625
al (13): 1570
ro (33): 1498
ri (27): 1485
is (28): 1472
la (13): 1444
or (33): 1426
le (17): 1425
at (21): 1404
ch (11): 1327
st (39): 1303
re (23): 1269
ti (29): 1253


(The commonly occurring th doesn't appear in this top 20 because the program counts each entry in the word list once, and doesn't consider word frequencies in normal text.)
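
To see the counting step on a small scale, here is a sketch that runs the same idea over a made-up two-word list held in memory instead of a file. The name countGrams and the sample words are assumptions for this example only.

```groovy
// Hypothetical helper: count every substring of 2 or more letters across a list of words.
def countGrams(List<String> words) {
    def counts = [:]
    words.each { word ->
        int n = word.size()
        for (int len = 2; len <= n; len++) {                  // sequence length
            for (int start = 0; start + len <= n; start++) {  // start position
                def gram = word.substring(start, start + len).toLowerCase()
                counts[gram] = (counts[gram] ?: 0) + 1
            }
        }
    }
    counts
}

def counts = countGrams(['banana', 'bandana'])
assert counts['an'] == 4    // twice in each word
assert counts['ana'] == 3   // twice in banana, once in bandana
assert counts['ban'] == 2   // once in each word
```

Each word contributes once however common it is in everyday writing, which is why the ranking differs from letter frequencies in running text.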

Ejoty In Reverse
Perhaps we want to easily convert numbers to letters. We could learn the letters for the numbers up to 26 easily enough, but what if we want to convert higher numbers? We could convert them to groups of letters whose values sum to the number, but for that we'd need to generate some common possibilities. This Groovy code uses the grams map we generated in the previous code sample to find the 5 most common sequences for each number up to 100:

def grGrams= grams.groupBy{gramVal(it.key)}
grGrams.entrySet().findAll{it.key <= 100}
       .sort{it.key}.each{
  print "$it.key (${it.value.inject(0){flo,itt-> flo+itt.value}}): "
  def set= it.value.entrySet().sort{it.value}.reverse()
  def setSz= 5
  if(set.size() >= setSz) set= set[0..setSz-1]
  println set.collect{"$it.key($it.value)" }.join(', ')
}


Here's a segment of output showing the most common letter sequences for numbers greater than 26:

27 (10243): ri(1485), lo(973), ol(955), sh(587), ve(442)
28 (11695): is(1472), th(855), si(785), mo(709), om(691)
29 (11231): on(1780), ti(1253), it(919), no(768), ell(251)
30 (7766): oo(370), ing(362), ati(245), iu(234), sk(207)
31 (7516): op(660), po(534), rm(330), vi(294), her(217)
32 (8183): sm(361), rn(276), tic(255), eri(249), mar(188)
33 (11647): ro(1498), or(1426), ul(550), hy(403), ns(355)
34 (9931): nt(950), os(793), um(504), so(424), pr(304)
35 (9482): to(1077), ot(634), un(541), ant(321), sp(282)
36 (8867): ou(788), rr(288), pt(183), ato(165), min(165)
37 (8499): rs(330), ly(267), ov(250), yl(220), pu(149)
38 (10054): tr(740), ss(458), rt(414), ion(261), qu(256)
39 (11026): st(1303), ur(745), ent(320), ru(304), per(244)
40 (9339): us(1130), su(353), tt(308), ast(225), sta(211)
41 (9141): ut(364), tu(282), ism(252), rin(191), yp(178)
42 (8756): ers(200), mon(199), olo(170), res(144), ori(132)
43 (9499): ter(534), ry(378), tin(177), nti(151), ium(138)
44 (8233): ste(219), ys(184), est(168), tio(157), sy(107)
45 (7260): ty(229), ver(182), yt(128), tte(107), aceou(104)
46 (7459): ris(147), rom(135), mor(108), orm(105), los(77)
47 (7762): sis(206), tri(180), ron(156), rit(118), ssi(107)
48 (8292): ist(236), sti(162), uri(116), tis(115), eter(114)
49 (7790): ton(202), rop(143), ont(122), pro(120), phy(120)
50 (6711): oto(117), graph(83), oun(59), tric(57), low(54)

There are plenty of choices there.

The code uses the GDK method groupBy, and the output above gives a visual representation of what applying it to some data produces.
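
For a smaller picture of what groupBy does to a map, here is a sketch using a handful of made-up counts rather than the real grams map. The names sample and letterSum are assumptions for this example only.

```groovy
// Hypothetical sample counts, much smaller than the real grams map.
def sample = ['an': 5, 'er': 3, 'in': 4, 'ri': 2, 'lo': 1]

// Sum of letter positions (a=1 ... z=26) for a lower-case sequence.
def letterSum = { String s ->
    s.toList().inject(0) { tot, ch -> tot + ((ch as char) as int) - 96 }
}

// groupBy on a map hands each entry to the closure and returns a map of sub-maps,
// one sub-map for each distinct value the closure returned.
def byValue = sample.groupBy { letterSum(it.key) }

assert byValue == [15: ['an': 5], 23: ['er': 3, 'in': 4], 27: ['ri': 2, 'lo': 1]]
```

Each key of the result is a letter sum, and each value is the part of the original map whose sequences add up to that sum.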
