Changes

Back Issue Digitizing Study Group

57 bytes added, 16:59, 24 February 2022
revision of process
<li>The next step is to process the PDF file through OCR (Optical Character Recognition) software. This can be done from the same interface that is used for the scanner. The resulting file shall be given the same name, changing the PDF suffix to RTF.</li>
<li>These two files are then given to this Study Group in such form and manner as seems best.
Note: When the original is in poor condition, it is reasonable to flag the issue as "Urgent", and the Study Group should prioritize work on such issues. A cause of only slightly lower priority is "Books", which is used for issues containing articles due for inclusion in a book project, such as the articles by Arthur Germond, as it is planned to combine these into a [[Germond|book]].</li>
<li>Members of the Back Issue Digitizing Study Group then select the files they wish to work on. Starting with the RTF file, they compare it to the PDF images and make such edits as may be needed, producing a file that is (preferably) in ODT format. (If this isn't possible, another plain or improved text format will do.) No effort should be expended on images, as they will be inserted later. They may be left as placeholders. No effort to reformat the text is required, though stripping odd page sizes, type styles, indents, frames, and spacing is helpful and easy to do at this stage. Multiple columns should be converted into one; tables should generally be rebuilt or left untouched. When this work is complete, the ODT (or text) file is returned to the Coordinator for final formatting and insertion of improved images.</li>
<li>The editor assigned to work on the project will then begin work on the ODT file, first converting from DOC or another text format if needed. It is at this stage that formatting is imposed. Where possible, a style close to that used for ''The [[Augustan Omnibus]]'' is used, though it is often necessary to change the font size so that articles will begin on the same page as the original. (This is to preserve any citations or references as much as possible.)</li>
<li>Some things will need to be redacted in these copies. Obsolete addresses and prices are the primary targets, and Society addresses need to have the current address inserted nearby. No effort to add current prices should be made. Care needs to be taken with the copyright notice, if any. Issues copyrighted by a predecessor of the Society (like the [[Octavian Society]] or the [[Hartwell Company]]) need to have a new notice added with the current date. A few issues claimed copyright belonged to the individual authors; these will take substantial legal effort before we can reprint them. In all cases, a notice that it is the "Second Printing" is required, along with the year.</li>
<li>Photographs in the ODT file are generally replaced, as the OCR process is unkind to images. This may be done by pulling the same image from the PDF file, or a replacement image may be used. As many images are public domain, it is often possible to replace a black & white photo with a color photo. Depictions of arms should be colorized, provided a blazon can be found without excess effort. In the process of replacing the images, it may be found helpful to reformat some pages or resize some images. This is left to the discretion of the editor, with the understanding that the goal is to recreate the original, particularly in regard to the page things originally appear on. This is important as it permits indexes to be built that apply equally to the original and this reproduction.</li>
<li>The original PDF file and the corrected ODT file (which may also be converted to PDF for this purpose) are then sent to a [[Proofreader]] to confirm that errors weren't introduced.</li>
</ol>
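The file-naming rule in the OCR step and the "Urgent" and "Books" flags can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the <code>Issue</code> record, the "Normal" flag, the numeric ranking, and the sample file names are hypothetical and are not part of the Study Group's procedure.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical ranking (an assumption): lower numbers are worked first.
# "Urgent" and "Books" come from the procedure; "Normal" is invented here.
PRIORITY = {"Urgent": 0, "Books": 1, "Normal": 2}


def ocr_output_name(pdf_name: str) -> str:
    """Return the OCR output name: same name, PDF suffix changed to RTF."""
    path = Path(pdf_name)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a PDF file, got {pdf_name!r}")
    # Keep the capitalization of the original suffix.
    new_suffix = ".RTF" if path.suffix.isupper() else ".rtf"
    return str(path.with_suffix(new_suffix))


@dataclass
class Issue:
    """Hypothetical record for one scanned back issue."""
    pdf_name: str
    flag: str = "Normal"


def work_order(issues):
    """Sort issues so Urgent come first, then Books, then the rest.

    sorted() is stable, so issues with the same flag keep their
    original relative order.
    """
    return sorted(issues, key=lambda issue: PRIORITY[issue.flag])
```

For example, <code>ocr_output_name("issue_042.pdf")</code> gives <code>"issue_042.rtf"</code>, and <code>work_order</code> places an issue flagged "Urgent" ahead of one flagged "Books".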
Care needs to be taken by the Coordinator that the same issues aren't taken up by different members or editors, to avoid duplication of work. This will become increasingly important as the size of this Study Group grows.
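One way the Coordinator might guard against duplicated work is a simple claim log, sketched below in Python. The class, its method names, and the sample issue and member names are assumptions made for illustration; the Study Group has not specified any particular tracking tool.

```python
class AssignmentLog:
    """Hypothetical record of which member is working on which issue."""

    def __init__(self):
        self._assigned = {}  # issue label -> member name

    def claim(self, issue: str, member: str) -> bool:
        """Assign the issue to the member unless someone else holds it.

        Returns True if the member now holds (or already held) the issue,
        False if it is already claimed by a different member.
        """
        holder = self._assigned.setdefault(issue, member)
        return holder == member

    def holder(self, issue: str):
        """Return the member holding the issue, or None if unclaimed."""
        return self._assigned.get(issue)

    def release(self, issue: str, member: str) -> None:
        """Free the issue, but only if this member is the one holding it."""
        if self._assigned.get(issue) == member:
            del self._assigned[issue]


log = AssignmentLog()
first = log.claim("Vol. 12 No. 3", "Member A")   # succeeds
second = log.claim("Vol. 12 No. 3", "Member B")  # refused: already claimed
```

A shared spreadsheet or wiki table would serve the same purpose; the point is only that an issue is recorded against one person before work begins.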
There have been proposals to convert the back issues to HTML for posting on the web site. This Study Group will not undertake that task, but will support, as far as possible, another Study Group chartered for that purpose.