SECTIONS
SPI-t010
rev 5/18/2022 15:28
SPI-0800.cfm rev 05/16/2022 13:00
SPORTS PAGE DOCUMENTATION
{ts '2023-09-21 14:59:06'}
RUN THE SYSTEM
To Run the system:
http://www.mmcctech.com/sportspage
SCRIPTS
Scripts are found on the server in:
/d/Websites/bcra-mlscom/mmcctech/SportsPage
MySQL DATABASE TABLES
Tables are stored in the MyBayCity.com database.
Sample query:
<cfquery name="SPIq_Stories" datasource="MyBayCity" >
SELECT * FROM mbc_tsportspage_idx
WHERE SPI_Type = 'P'
ORDER BY SPI_IssueDate, SPI_Issue_Page
</cfquery>
DATA FILES
Lowest level:
TXT file containing HTML rendering of a scanned newspaper page.
yyyy-mm/12-06-1976_THE BAY COUNTY SPORTS PAGE01.txt
OJ had all of the newspaper pages scanned as image files,
one file per page.
Those image files were run through an optical character recognition engine to make an HTML file for each page.
The conversion of the scanned page-image files produced passable HTML code.
We think that pictures were identified and saved separately as image files.
The full page-image files are available, but not used in the web system.
It's just too much volume!
Steve renamed the HTML files to TXT.
Those TXT files are on a thumb drive in folders by year and month.
(Some folders have multiple months.)
The folders were uploaded to the web server. They were run through a conversion script
and written to the database,
one record per page.
The file name format is significant:
It must start with the issue date in the format mm-dd-yyyy.
It must include the page number in the format PAGEnn.
For example:
12-06-1976_THE BAY COUNTY SPORTS PAGE01.txt
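The two file name rules above can be checked mechanically. The following is a minimal sketch in Python, not the actual conversion script (which is ColdFusion and is not shown in this document). The function name parse_page_filename and the exact regular expression are assumptions made for illustration.

```python
import re
from datetime import date

# Assumed pattern: mm-dd-yyyy at the very start, and PAGEnn (two digits)
# somewhere later in the name, as the two rules above require.
PAGE_FILE_RE = re.compile(r"^(\d{2})-(\d{2})-(\d{4}).*PAGE(\d{2})")


def parse_page_filename(name):
    """Return (issue_date, page_number), or None if the name breaks a rule."""
    match = PAGE_FILE_RE.match(name)
    if match is None:
        return None
    month, day, year, page = (int(part) for part in match.groups())
    try:
        issue_date = date(year, month, day)
    except ValueError:
        # Digits were present but do not form a real calendar date.
        return None
    return issue_date, page


result = parse_page_filename("12-06-1976_THE BAY COUNTY SPORTS PAGE01.txt")
```

A name without a two-digit page number, such as the "-BkMrk" file listed in the raw data examples, returns None under this sketch, so a conversion run could skip it or report it.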
These files are REMOVED from the web server once the conversion is complete.
Contact Sheets:
One file per 16 or 24 page issue.
Every issue has a single thumbnail of all 16 or 24 pages.
That thumbnail will be uploaded to the server.
The link attached to an individual page will point to the
thumbnail/contact sheet
of the issue the page appears in.
When a user does a SEARCH, the script will return a table of ALL pages on which
the search terms appear.
These results are used to show a single copy of EACH
thumbnail/contact sheet
at the bottom of the page.
Each thumbnail/contact sheet's "link" will point to the web location where the
issue is for sale, typically an Amazon Kindle book.
You can see the contact sheet from several places.
Use VIEW from the "Page List" script (SPI-0100.cfm).
Old style file name: 1976_07_19_Contact_Sheet.jpg
New style file name: CS_1976_08_02.jpg
Contact sheets are LEFT on the web server after the conversion is complete.
They are shown to the user in many places.
We think they will represent what is sold to customers.
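Because contact sheets exist under both the old and the new naming style, a script that needs the issue date has to accept both. The sketch below is one possible way to do that in Python. It assumes the three numeric groups in both styles are year, month, and day, and the function names contact_sheet_date and new_style_name are hypothetical.

```python
import re
from datetime import date

# Old style: 1976_07_19_Contact_Sheet.jpg   New style: CS_1976_08_02.jpg
OLD_STYLE = re.compile(r"^(\d{4})_(\d{2})_(\d{2})_Contact_Sheet\.jpg$")
NEW_STYLE = re.compile(r"^CS_(\d{4})_(\d{2})_(\d{2})\.jpg$")


def contact_sheet_date(name):
    """Return the issue date encoded in a contact sheet name, or None."""
    for pattern in (OLD_STYLE, NEW_STYLE):
        match = pattern.match(name)
        if match:
            year, month, day = (int(part) for part in match.groups())
            try:
                return date(year, month, day)
            except ValueError:
                return None
    return None


def new_style_name(issue_date):
    """Build the new style file name for an issue date."""
    return f"CS_{issue_date:%Y_%m_%d}.jpg"


old_date = contact_sheet_date("1976_07_19_Contact_Sheet.jpg")
renamed = new_style_name(old_date)
```

Here renamed holds the new style name for the old style example file, which is how a one-time rename of old style sheets could be driven.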
Raw data FOLDER and FILE EXAMPLES:
(This will all be deleted once conversion is done)
Two folders that could contain raw data files:
/d/Websites/bcra-mlscom/mmcctech/SportsPage/1976-08-09
1976 Month 8 and 9 (Aug - Sept)
/d/Websites/bcra-mlscom/mmcctech/SportsPage/1976-10-11
1976 Month 10 and 11 (Oct - Nov)
A folder containing some raw data files:
/d/Websites/bcra-mlscom/mmcctech/SportsPage/1976-12
1976 Month 12 (Dec)
08-02-1976_THE BAY COUNTY SPORTS PAGE-BkMrk.txt
12-06-1976_THE BAY COUNTY SPORTS PAGE01.txt
12-06-1976_THE BAY COUNTY SPORTS PAGE02.txt
THROUGH
12-06-1976_THE BAY COUNTY SPORTS PAGE24.txt
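The folder names above look like dates but are not: "1976-08-09" means year 1976, months 8 and 9. A short Python sketch of reading that convention follows. It assumes a folder name is always a four-digit year followed by one or two two-digit months, which is all the examples above show. The function name parse_folder_name is hypothetical.

```python
import re

# yyyy-mm for a single month, yyyy-mm-mm for a folder holding two months.
FOLDER_RE = re.compile(r"^(\d{4})-(\d{2})(?:-(\d{2}))?$")


def parse_folder_name(folder):
    """Return (year, [months]) for a raw data folder name, or None."""
    match = FOLDER_RE.match(folder)
    if match is None:
        return None
    year = int(match.group(1))
    months = [int(m) for m in match.groups()[1:] if m is not None]
    if any(not 1 <= m <= 12 for m in months):
        return None
    return year, months


two_months = parse_folder_name("1976-08-09")
one_month = parse_folder_name("1976-12")
```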
DATABASE
The primary data used by the Sports Page system is a single table in the
MyBayCity.com database.
This single table contains TWO types of records:
Type P
One record for every PAGE.
Type T
One record for every THUMBNAIL / CONTACT PAGE on the server.
Each record contains
Unique ID
Issue and the page number,
Type (P or T)
Text name of the issue
LINK
Type P link points to the Thumbnail / contact page image file, and the matching T record.
Type T link points to the URL where the issue is sold.
Data which is searched.
DataSource: MyBayCity
Table: mbc_tsportspage_idx
Lowest level of data -
One entry for each page from every issue.
Fields:
SPI_ProgramID: Auto-increment unique identifier for a page.
SPI_Type: One-byte text field indicating the record type:
    P = Single page with link to issue contact sheet.
    T = Description of a single issue (normally 16 pages) with link to where it is sold.
SPI_Issue: 50-byte text field that is the name of the page.
SPI_Link: Link to the thumbnail image for the entire issue.
    Each page from the same issue points to the same ISSUE THUMBNAIL page.
    Because the link is stored with each page, any page COULD point somewhere else.
SPI_Data: "Text" block containing a summary of the page.
    This is generated from the page scan,
    which is then read by an OCR-like processor.
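To make the record layout concrete, here is a small Python sketch that builds a stand-in for the table in an in-memory SQLite database and runs the same kind of Type P query shown in the sample query earlier. Several things are assumptions: SQLite is used only so the example is self-contained (the real table is in MySQL), the column types are guesses, SPI_IssueDate and SPI_Issue_Page are taken from the sample query rather than the field list, and every row value, including the contact sheet name and the example.com URL, is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mbc_tsportspage_idx (
        SPI_ProgramID  INTEGER PRIMARY KEY AUTOINCREMENT,
        SPI_Type       TEXT NOT NULL CHECK (SPI_Type IN ('P', 'T')),
        SPI_Issue      TEXT,
        SPI_IssueDate  TEXT,     -- from the sample query; ISO text in this sketch
        SPI_Issue_Page INTEGER,  -- from the sample query
        SPI_Link       TEXT,
        SPI_Data       TEXT
    )
""")

# Made-up rows: two page records and one contact sheet record for one issue.
rows = [
    ("P", "THE BAY COUNTY SPORTS PAGE", "1976-12-06", 2, "CS_1976_12_06.jpg", "text of page two"),
    ("P", "THE BAY COUNTY SPORTS PAGE", "1976-12-06", 1, "CS_1976_12_06.jpg", "text of page one"),
    ("T", "THE BAY COUNTY SPORTS PAGE", "1976-12-06", None, "https://example.com/issue", "contact sheet"),
]
conn.executemany(
    "INSERT INTO mbc_tsportspage_idx "
    "(SPI_Type, SPI_Issue, SPI_IssueDate, SPI_Issue_Page, SPI_Link, SPI_Data) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Same shape as the sample query: page records only, in issue and page order.
pages = conn.execute(
    "SELECT SPI_Issue_Page, SPI_Link FROM mbc_tsportspage_idx "
    "WHERE SPI_Type = 'P' ORDER BY SPI_IssueDate, SPI_Issue_Page"
).fetchall()
```

Both page rows carry the same SPI_Link, which mirrors the rule that every page from one issue points to the same contact sheet.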
WHAT'S FOR SALE
It's not clear exactly what will be sold.
We THINK that each issue (normally 16 pages) will be published as a Kindle book.
A single image (JPG) "contact sheet" containing thumbnail sized images of each
of the 16 pages in that issue will be stored on the server.
A description of that image will be found in the database as record type "T".
That contact sheet image entry will include a link pointing
to wherever that issue is sold.
(For testing, they all point to Kent's "James Milton" book.)
SO... let's say that the name "Steve" appears on four individual pages.
Two of those pages are in the 7/5/76 issue and two are in the 7/19/76 issue.
Following the search the system will show a table of the four references
and the page and issue the reference appears in.
Following that table of references the system will show the contact sheet image
for each of the two issues.
Each contact page image will be a link, which will go to the internet address
where the "book" of that issue can be purchased.
The customer will click the link of the issue/book they want and
make the purchase.
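The "Steve" example above comes down to one step: take the list of matching pages and show each issue's contact sheet only once. A minimal Python sketch of that step follows. The dictionary keys, the page numbers, and the contact sheet file names are invented for illustration, and the real search script may do this differently (for example, in SQL).

```python
def unique_contact_sheets(hits):
    """Return each contact sheet link once, in the order first seen."""
    seen = set()
    sheets = []
    for hit in hits:
        link = hit["link"]
        if link not in seen:
            seen.add(link)
            sheets.append(link)
    return sheets


# Four made-up search hits: two pages in each of two issues.
hits = [
    {"issue": "7/5/76", "page": 3, "link": "CS_1976_07_05.jpg"},
    {"issue": "7/5/76", "page": 9, "link": "CS_1976_07_05.jpg"},
    {"issue": "7/19/76", "page": 2, "link": "CS_1976_07_19.jpg"},
    {"issue": "7/19/76", "page": 14, "link": "CS_1976_07_19.jpg"},
]

sheets = unique_contact_sheets(hits)
```

The four hits would fill the table of references, and sheets holds the two contact sheet images to show below it.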
METHODS
How can I read a simple text file, processing each line of the file?
ColdFusion makes it easy to read a file using the <cfloop> tag.
By using the file attribute,
you can tell <cfloop> to iterate over each line of a file.
This sample reads in a text file and displays each line:
<cfset myfile = expandPath("./dump.txt")>
<cfloop index="line" file="#myfile#">
    <cfoutput>The current line is #line#</cfoutput>
</cfloop>
This question was written by Hal Helms
It was last updated on July 1, 2008.
Found at:
https://www.coldfusioncookbook.com/entries/How-can-I-read-a-simple-text-file-processing-each-line-of-the-file.html
application.cfm for SportsPage. rev: 2022/04/27 11:15