NLA Trial Articles from 1809

  1. Accuracy of OCR and overProof is measured against the human corrections. We know the human corrections in this sample are incomplete and themselves contain errors, but they are the best we could select automatically from the NLA newspapers corpus: articles tagged as completely corrected, then further filtered to those with at least 3 corrections, at least 40% of lines corrected, and a percentage of non-dictionary words in the lowest third.
  2. Accuracy is measured by a separate process from that used to colour words in this output: the colouring process is heuristic, and not completely accurate.
  3. Colour legend:
    Text - OCR text corrected by human and/or overProof
    Text - human and/or overProof corrections
    Text - discrepancies between human and/or overProof
    Text - human corrections not applied by overProof
  4. Identified overProof corrections are calculated by the statistical calculation process and show those words changed by overProof which ALSO match human corrections. As human corrections are often wrong and incomplete, so too is this list.
  5. Identified overProof non-corrections are calculated by the statistical calculation process and show those words in the overProof output which DO NOT MATCH human corrections. As human corrections are often wrong and incomplete, so too is this list. Words marked as [**VANDALISED] are those which have been changed by overProof but not by the human correction; as before, a missed human correction will be (incorrectly) classified as vandalisation by overProof.
  6. Searchability of unique words refers to the distinct words in an article, and how many are present before and after correction. It is a measure of how many of the words within an article could be used to find the article using a search engine.
  7. Weighted Words refers to a calculation in which common words count for little (a fraction of a word) and unusual words count for more, in proportion to the log of the inverse of their frequency in the corpus. It may be an indicator of how well distinctive words in an article can be searched before and after correction.
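
Notes 6 and 7 can be illustrated with a small Python sketch. The report does not give the exact formulas, so everything here is an assumption made for illustration: the function names, the use of the human-corrected text as the reference word set, the natural logarithm, and the treatment of words missing from the corpus counts as if they had been seen once.

```python
import math


def unique_word_searchability(article_words, reference_words):
    """Percentage of distinct reference words that also occur in the article text."""
    reference = set(reference_words)
    if not reference:
        return 100.0  # nothing to find, so nothing is missing
    found = reference & set(article_words)
    return 100.0 * len(found) / len(reference)


def word_weight(word, corpus_counts, corpus_total):
    """Weight a word by the log of its inverse corpus frequency (assumed form)."""
    # Assumption: a word absent from the counts is treated as seen once.
    count = max(corpus_counts.get(word, 0), 1)
    return math.log(corpus_total / count)


def weighted_searchability(article_words, reference_words, corpus_counts, corpus_total):
    """Like unique_word_searchability, but rare words count for more than common ones."""
    reference = set(reference_words)
    present = set(article_words)
    total = sum(word_weight(w, corpus_counts, corpus_total) for w in reference)
    if total == 0:
        return 100.0
    found = sum(word_weight(w, corpus_counts, corpus_total) for w in reference & present)
    return 100.0 * found / total


# Toy data (hypothetical counts, not taken from the NLA corpus).
toy_counts = {"the": 1000, "letters": 5, "parramatta": 2}
toy_total = 2000
ocr_words = ["the", "leiters", "parinmatta"]
human_words = ["the", "letters", "parramatta"]

plain = unique_word_searchability(ocr_words, human_words)
weighted = weighted_searchability(ocr_words, human_words, toy_counts, toy_total)
```

In this toy case only the common word "the" survives OCR, so the weighted score comes out well below the plain unique-word score, which is the effect note 7 describes: losing distinctive words costs more than losing common ones.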

Article ID 627733, Article, TO THE PUBLIC., page 2 1809-04-30, The Sydney Gazette and New South Wales Advertiser (NSW : 1803 - 1842), 221 words, 11 corrections

Raw OCR    Human Corrected    overProof Corrected
His Honor the Lieutenant Governor    HIS HONOR the LIEUTENANT GOVERNOR    His Honor the Lieutenant Governor
having been pleased to appoint me to receive, and    having been pleased to appoint me to receive, and    having been pleased to appoint me to receive, and
attend to the due delivery of all Leiters .and Par-    attend to the due delivery of all Letters and Par-    attend to the due delivery of all Letters and Parcels
cels directed to individuals, Ï beg leave to inform    cels directed to individuals, I beg leave to inform    directed to individuals, I beg leave to inform
the Inhabitants at large, that a List of Persons    the Inhabitants at large, that a List of Persons    the Inhabitants at large, that a List of Persons
to whom such may be directed, will always be    to whom such may be directed, will always be    to whom such may be directed, will always be
conspicuously posted in front of my house,, nhich j    conspicuously posted in front of my house, which    conspicuously posted in front of my house,, which j
is near to the Hospital Wharf. 'And I beg leaye    is near to the Hospital Wharf. And I beg leave    is near to the Hospital Wharf. 'And I beg leave
to add, that every possible attention shall be paid    to add, that every possible attention shall be paid    to add, that every possible attention shall be paid
to the observance of punctuality, which alone can    to the observance of punctuality, which alone can    to the observance of punctuality, which alone can
render such au establishment generally bénéficiai,    render such an establishment generally beneficial,    render such an establishment generally beneficial,
or give satisfaction in the performance of its    or give satisfaction in the performance of its    or give satisfaction in the performance of its
duties to the Public's very respectful Sei vant,    duties to the Public's very respectful Servant,    duties to the Public's very respectful Servant,
I. Nichols.    I. NICHOLS.    I. Nichols.
The undermentioned Letters are on delivery at    The undermentioned Letters are on delivery at    The undermentioned Letters are on delivery at
the Office of the .Naval Officer's Assistant at the    the Office of the Naval Officer's Assistant at the    the Office of the Naval Officer's Assistant at the
Hospital Wharf:    Hospital Wharf :—    Hospital Wharf:
Thos. Kelly, Parinmatta John Jones, Settler    Thos. Kelly, Parramatta, | John Jones, Settler    Thos. Kelly, Parramatta John Jones, Settler
Joseph Rowley James Hunter .    Joseph Rowley | James Hunter    Joseph Rowley James Hunter .
Paul Leathci borough, G Charlotte Jennings    Paul Leatherborough, 2 | Charlotte Jennings    Paul Leather borough, G Charlotte Jennings
Foseph Pierson, Stonemason Win. Fletcher, 3    Joseph Pierson, Stonemason | Wm. Fletcher, 3    Joseph Pierson, Stonemason Wm. Fletcher, 3
Ann Parker Airs. A. Gangell    Ann Parker | Mrs. A. Gangell    Ann Parker Mrs. A. Gangell
Win. Milson, Parramatta John Stanton    Wm. Milson, Parramatta | John Stanton    Win. Wilson, Parramatta John Stanton
Dr. M'Millan Henry Otter    Dr. McMillan | Henry Otter    Dr. McMillan Henry Otter
John Howard, mariner John Shangall    John Howard, mariner | John Shangall    John Howard, mariner John Shangall
Thos. Dud eigh. Locksmith Richard Cam    Thos. Dudleigh, Locksmith | Richard Cam    Thos. Dud eigh. Locksmith Richard Cain
.lames Sherrard, ILiwkcsb. Alexander Bell    James Sherrard, Hawkesb. | Alexander Bell    James Sherrard, ILiwkcsb. Alexander Bell
Thomas Trelaiuy Air. John Fleming    Thomas Trelainy | Mr. John Fleming    Thomas Trelaiuy Mr. John Fleming
John Wilkinson Thomas Stevens    John Wilkinson | Thomas Stevens    John Wilkinson Thomas Stevens
Daniel Bryan, Settler William Knight'    Daniel Bryan, Settler | William Knight    Daniel Bryan, Settler William Knight'
John Cooper, Panamatta Esther M m ray    John Cooper, Parramatta | Esther Murray    John Cooper, Parramatta Esther M m ray
John Mooic, Carpenter Joseph Cox    John Moore, Carpenter | Joseph Cox    John Moore, Carpenter Joseph Cox
Sarah Stubbs, Portland Head,    Sarah Stubbs, Portland Head,    Sarah Stubbs, Portland Head,
Margaret Richards, Concord;    Margaret Richards, Concord ;    Margaret Richards, Concord;
William Lee, Shoemaker.    William Lee, Shoemaker.    William Lee, Shoemaker.
Measure                          Count    OCR accuracy %    overProof accuracy %    corrected %
All Words                        206      89.3              96.1                    63.6
Searchability of unique words    146      91.1              95.2                    46.2
Weighted Words                            92.0              95.9                    48.5
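
The report does not define the "corrected %" figure. One reading that is consistent with the All Words row (accuracies 89.3 and 96.1, corrected 63.6) is the share of the original OCR errors that no longer appear after overProof. The Python sketch below works that out; the formula, the function name, and the assumption that the first accuracy is the raw OCR and the second is overProof are inferences from the numbers, not statements from the source.

```python
def errors_corrected_pct(ocr_accuracy, overproof_accuracy):
    """Share of OCR errors removed by correction, as a percentage (assumed formula)."""
    remaining_error = 100.0 - ocr_accuracy
    if remaining_error <= 0:
        return 0.0  # a perfect OCR result leaves nothing to correct
    return 100.0 * (overproof_accuracy - ocr_accuracy) / remaining_error


# All Words row: (96.1 - 89.3) / (100 - 89.3) = 6.8 / 10.7
all_words = round(errors_corrected_pct(89.3, 96.1), 1)  # 63.6
```

The other rows do not reproduce exactly from the rounded accuracies shown (for example 46.1 rather than 46.2 for unique words), which would be expected if the published figures were computed from unrounded values.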

Accumulated stats for 1 article from year 1809

Measure                          Count    OCR accuracy %    overProof accuracy %    corrected %
All Words                        206      89.3              96.1                    63.6
Searchability of unique words    146      91.1              95.2                    46.1
Weighted Words                            92.0              95.9                    48.8