Need assistance in running 16-threaded superheavy fuzzy search

Sanmayce

Honorable
Oct 30, 2013
11
0
10,520
Hi.

I hesitated for a day or two over whether to post this, but I really want to see the numbers from the ENWIKI test (the whole October 2014 dump).

I only have access to two laptops with Core 2 CPUs (T7500 and Q9550s, with 2 and 4 threads respectively). What interests me most is how a 5960X performs, since it can run 16 threads.

The log below is for a fuzzy search within Levenshtein distance 4 for the pattern "Silvestor Staloune".

I guess most people would make up to 4 typos, which is why I chose 4.
"Silvestor Staloune" has:
'y' replaced with 'i'
'e' replaced with 'o'
one 'l' deleted
'u' added
The correct name is:
Sylvester Stallone
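
To make the "distance 4" claim concrete, here is a minimal sketch of the textbook Wagner-Fischer computation that Kazahana's banner names. It is not Kazahana's actual code (the source is not quoted in this post); the function name levenshtein is just an illustration, and it keeps only two rows of the matrix to save memory.

```c
#include <stdlib.h>
#include <string.h>

/* Levenshtein distance between a and b (Wagner-Fischer, two-row variant).
   Illustrative sketch only, not taken from Kazahana.
   Returns (size_t)-1 if memory allocation fails. */
size_t levenshtein(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = malloc((lb + 1) * sizeof *prev);
    size_t *curr = malloc((lb + 1) * sizeof *curr);
    if (!prev || !curr) { free(prev); free(curr); return (size_t)-1; }

    /* Row 0: turning "" into the first j chars of b costs j insertions. */
    for (size_t j = 0; j <= lb; j++) prev[j] = j;

    for (size_t i = 1; i <= la; i++) {
        curr[0] = i;  /* first i chars of a -> "": i deletions */
        for (size_t j = 1; j <= lb; j++) {
            size_t sub = prev[j - 1] + (a[i - 1] != b[j - 1]); /* replace or match */
            size_t del = prev[j] + 1;                          /* delete from a */
            size_t ins = curr[j - 1] + 1;                      /* insert into a */
            size_t best = sub < del ? sub : del;
            curr[j] = best < ins ? best : ins;
        }
        size_t *tmp = prev; prev = curr; curr = tmp;  /* reuse the rows */
    }

    size_t d = prev[lb];  /* after the final swap, prev holds the last row */
    free(prev);
    free(curr);
    return d;
}
```

Calling levenshtein("Sylvester Stallone", "Silvestor Staloune") returns 4, i.e. the two replacements, one deletion and one addition listed above.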

In the resulting file I spotted the following typos (outside the redirect tags):
Sylvester Stalone
Sylvestor Stallone
Slvester Stallone
Silvester Stallone

Obviously even Wikipedia is not fully proofread; I guess an ocean of people would misspell the beloved actor's name as well.

// Test on laptop with Q9550s 2833MHz, 4/4 cores/threads, Windows 7 64bit:
/*
D:\_KAZE\GameraWikipediaWiktionary>type Kazahana_2014-Dec-04\Kazahana_compile_GCC.bat
gcc -O3 -funroll-loops -static -o Kazahana_Hexadecad_GCC_472 Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE.c -fopenmp -DCommence_OpenMP -D_FILE_OFFSET_BITS=64 -D_gcc_mumbo_jumbo_
gcc -O3 -funroll-loops -static -o Kazahana_Monad_GCC_472 Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE.c -fopenmp -D_FILE_OFFSET_BITS=64 -D_gcc_mumbo_jumbo_

D:\_KAZE\GameraWikipediaWiktionary>type Kazahana_2014-Dec-04\Kazahana_compile_Intel12_64bit.bat
icl /O3 /arch:SSE2 /QxSSE2 /Qunroll /MT Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE.c /FAcs /FeKazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_HEXADECAD-Threads_IntelV12_SSE2_64bit /Qopenmp /Qopenmp-link:static -DCommence_OpenMP -D_icl_mumbo_jumbo_
icl /O3 /arch:SSE2 /QxSSE2 /Qunroll /MT Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE.c /FeKazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_MONAD-Thread_IntelV12_SSE2_64bit -D_icl_mumbo_jumbo_

D:\_KAZE\GameraWikipediaWiktionary>timer32.exe Kazahana_Hexadecad_GCC_472.exe 4e "Silvestor Staloune" enwiki-20141008-pages-articles.xml 11263
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, r. 1-++fix+nowait_critical_nixFIX_Wolfram+fixITER+EX+CS_fix, copyleft Kaze 2014-Nov-19.
Pattern: Silvestor Staloune
omp_get_num_procs( ) = 4
omp_get_max_threads( ) = 4
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 11263KB ... OK
\; 00,000,001,376 bytes/clock
Kazahana: Total/Checked/Dumped xgrams: 800,855,553/342,059,464,575/2,106
Kazahana: Performance: 1 KB/clock
Kazahana: Performance: 21 xgrams/clock
Kazahana: Performance: Total/fread() clocks: 36,459,222/1,379,563
Kazahana: Performance: I/O time, i.e. fread() time, is 3 percents
Kazahana: Done.

Kernel Time = 38.345 = 0%
User Time =136250.493 = 373%
Process Time =136288.838 = 373% Virtual Memory = 14 MB
Global Time = 36460.185 = 100% Physical Memory = 16 MB

D:\_KAZE\GameraWikipediaWiktionary>dir Kazahana.txt
Volume in drive D is S640_Vol5
Volume Serial Number is 5861-9E6C

Directory of D:\_KAZE\GameraWikipediaWiktionary

12/03/2014 01:10 PM 1,064,420 Kazahana.txt
1 File(s) 1,064,420 bytes
0 Dir(s) 63,694,749,696 bytes free

D:\_KAZE\GameraWikipediaWiktionary>timer32.exe Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_HEXADECAD-Threads_IntelV12_SSE2_64bit 4e "Silvestor Staloune" enwiki-20141008-pages-articles.xml 11263
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, r. 1-++fix+nowait_critical_nixFIX_Wolfram+fixITER+EX+CS_fix_DEFINE, copyleft Kaze 2014-Dec-03.
Pattern: Silvestor Staloune
omp_get_num_procs( ) = 4
omp_get_max_threads( ) = 4
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 11263KB ... OK
\; Speed: 00,000,002,001 bytes/clock; Traversed: 50,144,448,379 bytes
Kazahana: Total/Checked/Dumped xgrams: 800,855,553/342,059,464,575/2,106
Kazahana: Performance: 1 KB/clock
Kazahana: Performance: 31 xgrams/clock
Kazahana: Performance: Total/fread() clocks: 25,073,428/602,292
Kazahana: Performance: I/O time, i.e. fread() time, is 2 percents
Kazahana: Performance: RDTSC I/O time, i.e. fread() time, is 1,704,219,997,078 ticks
Kazahana: Done.

Kernel Time = 284.670 = 1%
User Time = 92233.204 = 367%
Process Time = 92517.875 = 368% Virtual Memory = 17 MB
Global Time = 25073.682 = 100% Physical Memory = 16 MB

D:\_KAZE\GameraWikipediaWiktionary>dir Kazahana.txt
Volume in drive D is S640_Vol5
Volume Serial Number is 5861-9E6C

Directory of D:\_KAZE\GameraWikipediaWiktionary

12/04/2014 08:51 AM 1,064,420 Kazahana.txt
1 File(s) 1,064,420 bytes
0 Dir(s) 67,609,645,056 bytes free

D:\_KAZE\GameraWikipediaWiktionary>
*/
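
The figures in the log can be cross-checked with a little arithmetic. The sketch below is only my reading of the numbers, under two assumptions: that one "clock" is one millisecond (the "Total clocks" value lines up with timer32's Global Time if that one is in seconds), and that the reported values are truncated rather than rounded. The function names are made up for the illustration.

```c
#include <stdint.h>

/* Whole xgrams handled per clock tick (integer division, i.e. truncated). */
uint64_t xgrams_per_clock(uint64_t total_xgrams, uint64_t total_clocks)
{
    return total_clocks ? total_xgrams / total_clocks : 0;
}

/* CPU load as a truncated percentage of one core:
   400 would mean four cores kept fully busy. */
unsigned cpu_load_percent(double user_seconds, double wall_seconds)
{
    return wall_seconds > 0.0
        ? (unsigned)(100.0 * user_seconds / wall_seconds)
        : 0u;
}

/* How many times faster run B is than run A, given their total clocks. */
double speedup(uint64_t clocks_a, uint64_t clocks_b)
{
    return clocks_b ? (double)clocks_a / (double)clocks_b : 0.0;
}
```

With the values from the two runs, xgrams_per_clock(800855553, 36459222) gives 21 and xgrams_per_clock(800855553, 25073428) gives 31, matching the log; cpu_load_percent(136250.493, 36460.185) gives 373 and cpu_load_percent(92233.204, 25073.682) gives 367; and speedup(36459222, 25073428) comes out at roughly 1.45, so the Intel build finished about 1.45 times faster than the GCC build on this laptop.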

If you want to reproduce the benchmark, and thus help me, the enwiki file is downloadable at:
http://dumps.wikimedia.org/enwiki/
http://dumps.wikimedia.org/enwiki/20141008/enwiki-20141008-pages-articles.xml.bz2

It is 11GB compressed and 50GB decompressed.
The source and executable are downloadable at:
Intel Developer Zone

I hope there is someone here on TH who, like me, is interested in high speeds and heavy textual torture.
Simply put, the test is natively 16-threaded and cries out for a 16-thread CPU.
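
For anyone wondering how a program can insist on 16 threads even when the machine reports only 4 (as in the "Enforcing HEXADECAD" line of the log), here is a minimal OpenMP sketch. It is not Kazahana's code: the THREADS constant and the count_byte function are hypothetical, and the OpenMP runtime is still free to hand out fewer threads than requested. Compile with -fopenmp (GCC) or /Qopenmp (Intel); without that flag the pragma is ignored and the loop simply runs on one thread.

```c
#include <stddef.h>

#define THREADS 16  /* hypothetical: the fixed thread count being requested */

/* Count how many times byte c occurs in buf[0..len).
   The loop iterations are split across THREADS threads, and each
   thread's partial count is summed by the reduction clause. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char c)
{
    long long total = 0;
    long long n = (long long)len;
    long long i;

    #pragma omp parallel for num_threads(THREADS) reduction(+:total) schedule(static)
    for (i = 0; i < n; i++)
        total += (buf[i] == c);

    return (size_t)total;
}
```

On a 4-thread CPU the 16 threads just take turns on the available cores, which is why the laptop runs above top out below 400% CPU load; a 16-thread CPU could run them all at once.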
 

Sanmayce

To clarify my call for assistance, here is what is needed, in the form of a walkthrough:

Step #1: Download the searcher at:
https://software.intel.com/sites/default/files/managed/3d/8f/Kazahana_Intel15_parallel-for.zip

Step #2: Download the compressed testdata at:
http://dumps.wikimedia.org/enwiki/20141008/enwiki-20141008-pages-articles.xml.bz2

Step #3: Double-click 'MokujIN 224 prompt.lnk'; it will open a console prompt.

Step #4: Go to the folder containing BOTH the decompressed Wikipedia and Kazahana and run 'TORTURE.BAT'.

Step #5: Share the results.

My desire is to lay my sore eyes on Haswell-E results.
I am so crazy for speed that the second executable can run more than 16 threads, just for those guys who think that the 5960X is the top of the cake.
 