Town-based extractions from the Ellis Island Database

Ronald D. Doctor

For the past couple of years, the Kremenets District Research Group (KDRG) has been
extracting and processing records from the Ellis Island Database that mention towns
of the Kremenets District. We now have extracted all records for the towns of
Shumskoye and Vishnevets and we have indexed two sets of un-indexed fields: Person
at Destination, and Nearest Relative at Origin.

The data are in an Excel 2007 spreadsheet, EIDB Master.xlsx, which is now available
for download on the Kremenets Shtetlinks website. The spreadsheet
has 1,041 records: 356 for immigrants from Shumsk, 684 from Vishnevets, and 1
(so far) from Vyshgorodok. We soon will add almost 1,500 records from Kremenets.
Records for other towns will be added as processing and proofreading are completed.

Please be sure to read the document titled "Guide to EIDB records from the
Kremenets District". It explains some fine points for using the spreadsheet. In
addition, we have posted a PowerPoint presentation given by KDRG Board Member
Susan Sobel at the 2010 Annual Memorial Meeting (Askara) of the Kremenets-Shumsk
Society in Israel. It describes some of the more interesting ways to use the newly
indexed fields.

As you peruse the spreadsheet, you will notice that (1) a large number of entries
are grossly misspelled in the EIDB transcription and therefore are very difficult
to locate by standard methods, and (2) a large number of name entries were indexed
incorrectly by the Ellis Island folks and lead to the wrong manifest. As described
below, our project has corrected both of these problems.

To extract records from the EIDB using town names, we had to recognize the many
ways in which town names could be spelled. We used a variety of approaches to
create a glossary of town name spelling variations and used this glossary to
extract town-based records from the EIDB. We found 81 variations in spelling for
Shumsk and 205 variations for Vishnevets. These are shown in the EIDB Master
spreadsheet, in a worksheet named "Variant spelling of town names".
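
As a rough illustration of glossary-based extraction (not the procedure KDRG
actually used), the Python sketch below maps variant spellings to a canonical town
name and keeps only the records whose town field matches. The glossary entries, the
"residence" field name, and the function names are all assumptions made for this
example; the real list of variants is the one in the worksheet.

```python
import unicodedata

# Hypothetical glossary: normalized variant spelling -> canonical town name.
# These few entries are placeholders for illustration, not the KDRG glossary.
GLOSSARY = {
    "shumsk": "Shumsk",
    "shumskoye": "Shumsk",
    "szumsk": "Shumsk",
    "vishnevets": "Vishnevets",
    "wisniowiec": "Vishnevets",
}


def normalize(name):
    """Lowercase the name and drop accents, spaces, and punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalpha()).lower()


def canonical_town(raw_name):
    """Return the canonical town for a raw spelling, or None if it is not in the glossary."""
    return GLOSSARY.get(normalize(raw_name))


def extract_by_town(records, field="residence"):
    """Keep records whose town field matches the glossary, tagging each with its canonical town."""
    matched = []
    for record in records:
        town = canonical_town(record.get(field, ""))
        if town is not None:
            matched.append({**record, "town": town})
    return matched


if __name__ == "__main__":
    sample = [
        {"name": "Passenger A", "residence": "Szumsk"},
        {"name": "Passenger B", "residence": "Wiśniowiec"},
        {"name": "Passenger C", "residence": "Warsaw"},
    ]
    for row in extract_by_town(sample):
        print(row["name"], "->", row["town"])
```

An exact lookup like this only finds spellings that are already in the glossary,
which is why the glossary itself has to be built up by other means first.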

Data from the spreadsheet also are in the One-Step searchable "Indexed Concordance
of Personal Names and Town Names", on our Shtetlinks website. Be sure to read our
"Introduction and Guides ..." document. It tells you how to use the cryptic
entries that are in the Source and Location in Source columns of the Concordance.
Basically, we have used the Passenger ID (PID), a unique value for each passenger,
as the index key in the Concordance. The PID can be used in Steve
Morse's One-Step Search Engine for the EIDB to locate the manifest image. However,
we found that 5.2% of the Ellis Island database entries we examined had PIDs that
lead to the wrong manifest. In these cases we have determined the URL that will
take you to the correct manifest. Since most of these URLs are extremely long, we
have used Google's URL shortening service to provide you with shortened URLs.
These shortened URLs are in the PID column of the spreadsheet and are used in the
Location in Source column of the Concordance as an index term.
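
For anyone processing the spreadsheet programmatically, the PID column therefore
holds two kinds of values: an ordinary Passenger ID, or a shortened URL where the
EIDB's own PID led to the wrong manifest. The Python sketch below shows one way to
tell them apart. It assumes the cell arrives as text or an integer and that
corrected entries begin with "http://" or "https://"; the function name and the
sample values are hypothetical.

```python
def manifest_reference(pid_cell):
    """Classify a PID-column value as a Passenger ID or a corrected manifest URL.

    Returns a (kind, value) pair where kind is "pid" or "url".
    """
    value = str(pid_cell).strip()
    if value.lower().startswith(("http://", "https://")):
        # Corrected entry: follow the shortened URL to the right manifest.
        return ("url", value)
    if value.isdigit():
        # Ordinary entry: use the PID in a One-Step search.
        return ("pid", value)
    raise ValueError(f"Unrecognized PID column value: {pid_cell!r}")


if __name__ == "__main__":
    # Placeholder values only; they are not real PIDs or links.
    for cell in ["123456789012", "https://example.invalid/abc"]:
        print(manifest_reference(cell))
```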

If you have any questions, please contact me or Susan Sobel.

Ron Doctor
Co-Coordinator, Kremenets Shtetl CO-OP/Jewish Records Indexing-Poland
An activity of the Kremenets District Research Group

Kremenets, Oleksinets, Yampol, Vishnevets
and KAZDOY (KOSODOY), DUBINSKI, DUBOWSKY ... all from Kiev, Uman, Odessa
