[BrianWall-ChessList] Opening Master details for Brian Wall

Brian Wall BrianWallChess3 at taom.com
Mon Aug 10 22:21:22 MDT 2009



----- Forwarded message from Jon Fortune <JFortune at hsri.org> -----
    Date: Mon, 10 Aug 2009 12:17:00 -0700
    From: Jon Fortune <JFortune at hsri.org>
Reply-To: Jon Fortune <JFortune at hsri.org>
 Subject: Opening Master details for Brian Wall
      To: Brian Wall <brianwallchess3 at taom.com>, "BrianWallChess4 at Yahoo.com"
<BrianWallChess4 at Yahoo.com>, "Andrew M. Smith" <drewmister at bresnan.net>,
"George W. Lundy III (ID)" <tdmlundy at juno.com>

Hi Zarkon,

Thank you for your reply and your question. We've been waiting for this, and you
represent many other people who have been asking us the same question (offline):
how in the world can we have 1.4 million more games than Chess Base Mega, one of
the strongest and biggest databases until now? How can it be that we have 5.2
million and they have 3.8 million?  (This is Opening Master, as seen on the
Internet at
http://openingmaster.com/Forums/15-About-Databases/114-RE-Database-sources.html
 Jon Fortune)

Let me explain in brief:

I have personally played chess for more than 30 years. The team around Opening
Master also includes international masters who know what to do. I started as a
practical chess player and then moved to correspondence play due to time
limitations. I wrote a short article, published on our web page, "History of
the chess databases", which I recommend reading:
http://www.openingmaster.com/index.php?/Chess-Databases-articles/History-of-Chess-Databases.html

I would divide the whole chess database era into 3 sections:

1) Before computers (books and notes)
2) Computer era, starting with computer file formats like PGN and other experiments
3) Internet era and commercial programs using the most advanced file formats

Each era had its own specifics of data collection, and its own collectors of
data who knew each other. The best known was Lars Balzer, who today sells 3 full
DVDs of data. Its only drawback is that it is huge in terms of the number of
files: approx. 40,000. Whoever cannot combine that many files into one file is
finished, so to say, and believe it or not, it's not an easy task. For mass
handling and analysis of files, Chess Assistant is appropriate, as it can join
approx. 2,000 files in one operation. This is, however, a different topic.
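
To illustrate the "many files into one" problem, here is a minimal Python
sketch. It is not the Chess Assistant workflow mentioned above and not Opening
Master's own tooling. It assumes the collection is a directory tree of plain
PGN text files; the function name merge_pgn_files and the UTF-8 decoding with
replacement of undecodable bytes are assumptions made for this example.

```python
from pathlib import Path


def merge_pgn_files(source_dir: str, output_path: str) -> int:
    """Concatenate every *.pgn file under source_dir into one file.

    Returns the number of non-empty files that were merged.
    """
    out = Path(output_path).resolve()
    count = 0
    with open(out, "w", encoding="utf-8") as merged:
        # sorted() gives a reproducible order across runs
        for path in sorted(Path(source_dir).rglob("*.pgn")):
            if path.resolve() == out:
                continue  # never read the file we are writing
            # Assumption: undecodable bytes are replaced rather than fatal
            text = path.read_text(encoding="utf-8", errors="replace").strip()
            if text:
                # A blank line separates the games of consecutive files
                merged.write(text + "\n\n")
                count += 1
    return count
```

Plain concatenation is enough here because PGN is a text format in which games
simply follow one another; it does nothing about duplicates or data quality.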

Anyone who has downloaded and looked at the sample database OM A00 Irregular
openings at
http://www.openingmaster.com/index.php?/Opening-Master-FREE-Pack/View-category.html
can see 52,000 games, but can also see that the names are not 'normalized',
except for a larger group of GMs, i.e. sometimes you see the first name and
sometimes the last name. That is because we use different sources than, for
example, Chess Base Mega (occasionally we use the same source, and there the
data should match).
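
For readers unfamiliar with the term, the following Python sketch shows what
'normalizing' a player name could look like. It is only an illustration: the
function normalize_name and its "Last, First" target form are assumptions, and
the heuristic of treating the last word as the surname fails for multi-word
surnames.

```python
def normalize_name(raw: str) -> str:
    """Return 'Last, First' where both parts can be recognised."""
    name = " ".join(raw.split())  # collapse stray whitespace
    if "," in name:
        # Already in 'Last, First' order; only tidy the spacing
        last, first = (part.strip() for part in name.split(",", 1))
        return f"{last}, {first}" if first else last
    parts = name.split(" ")
    if len(parts) == 1:
        return name  # only one name given: nothing to reorder
    # Assumption: the final word is the surname
    return f"{parts[-1]}, {' '.join(parts[:-1])}"
```

For example, both "Garry Kasparov" and "Kasparov,Garry" come out as
"Kasparov, Garry", while a lone "Kasparov" is left unchanged.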

Believe it or not, at the beginning of the project around 4 years ago, when we
had completed and glued together all the data from all INTERNET sources, we came
up with 30,000,000 games. We had everything there. Analysis was not possible
even with the best computers; it kept crashing, so we had to split the files
into segments of 5,000,000 games and continue the analysis. With de-duplication
analysis the number became smaller and smaller.
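
As a rough illustration of de-duplication on data too large to load at once,
here is a minimal Python sketch. It is not Opening Master's method, which is
not described in this message. It assumes each game arrives as a pair of
header dictionary and move text, and the names game_key and dedupe are
hypothetical. Only a short fingerprint per game is kept in memory, so the
input can be streamed segment by segment.

```python
import hashlib
import re
from typing import Dict, Iterable, Iterator, Tuple

Game = Tuple[Dict[str, str], str]  # (headers, movetext)


def game_key(movetext: str) -> str:
    """Fingerprint a game by its moves only, ignoring layout differences."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop {comments}
    text = re.sub(r"\d+\.(\.\.)?", " ", text)   # drop move numbers
    # Drop a trailing result marker such as 1-0 or 1/2-1/2
    text = re.sub(r"(1-0|0-1|1/2-1/2|\*)\s*$", " ", text.strip())
    moves = " ".join(text.split())
    return hashlib.sha1(moves.encode("utf-8")).hexdigest()


def dedupe(games: Iterable[Game]) -> Iterator[Game]:
    """Yield each game the first time its move sequence is seen."""
    seen = set()
    for headers, movetext in games:
        key = game_key(movetext)
        if key in seen:
            continue
        seen.add(key)
        yield headers, movetext
```

Keying on the moves alone sidesteps inconsistent player names, but it also
treats two different games with identical moves (for example short draws) as
one, so a real pipeline would need additional checks.
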
After that we moved to another stage: "what shouldn't be there". First to go
were the computer games. Even these days, there are web portals where you can
immediately download 3.5 million computer games. The single biggest problem is
their quality. Today's engines, although dramatically improved compared to a
few years ago, still struggle with endgames until they reach the so-called
'TableBase', and this is the tragedy.
So the issue with databases is not only about quantity but mainly about
quality. Everybody has a different attitude towards collecting.
The aim of this database is to provide the best-quality reports using the best
chess database programs. And it does! We look at the system from a pragmatic
point of view: what would chess players like us want to see in the database,
and what not? And if we don't want to see something, we consult other
professionals, and if we are of the same opinion we either remove it or leave
it in.
So as you see, it's not a blind copy-paste of everything you see.

And the method of acquiring data? Well, we sit ¾ of the day at our computers
and search the web, small sites and big sites, everything perfectly legal
(as you know, there are no copyright or trademark claims on an individual game,
as it is considered an "event"). We collect all legal games which were
officially published and have the quality. So we can call ourselves collectors
and analyzers.
Take, for instance, the Balzer data files: all analyzed and transferred into one
data file, an extremely time-consuming job. There are many forms of collection
and analysis, and it requires daily involvement and work with the data.

I think you have understood by now that this database was not created in a
short time frame, nor easily. The method of analysis, filtering and
de-duplication alone is our own know-how. We know of databases of 8 million and
11 million games, but they are nowhere close to the quality we offer (remember
what we said above about what we had at the beginning: 30,000,000 games).

Therefore we claim our database is the largest "quality" database on Earth.
Somebody may say there are also junior games in it. We discussed this within our
team and decided to keep them in for the added value. Today's juniors play like
seniors used to play 40 years ago. They have special trainers, they learn very
fast, and using the right techniques they win championships.

Everything is about everyday work and passion over many years, and then you can
create a quality database.

Best Regards,
Alexander Horvath
SIM ICCF

-------------------------------------------------------
Human Services Research Institute
Jon Fortune, Senior Policy Specialist
7420 SW Bridgeport Road Suite 210
Portland, Oregon 97224
Phone 503.924.3783 xt 13
Fax 503.924.3789
Cell 503.307.1188




